Run Summary
📈 LSP Comparison
Average Latency (ms)
Lower is better - time to resolve "Go to Definition"
OK Rate (%)
Higher is better - requests that completed without timeout/error
Success Rate (%)
Higher is better - valid definitions found
Latency Distribution
Distribution of mean latency across all packages (lower is better). Triangles show the median P95 latency.
📋 Detailed Results
| Package | LSP | Avg Latency | P50 Latency | P95 Latency | OK % | Success % |
|---|---|---|---|---|---|---|
📐 Methodology
🎯 What We Measure
We benchmark the `textDocument/definition` LSP request (Go to Definition)
across different Python language servers. This is one of the most commonly used
IDE features for code navigation.
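To make the measured request concrete, the sketch below builds a `textDocument/definition` JSON-RPC message with the `Content-Length` framing that LSP servers expect on stdin. The function name and the example URI are illustrative, not part of the benchmark harness; the field names follow the LSP specification.

```python
import json

def definition_request(request_id, file_uri, line, character):
    """Build a framed JSON-RPC 2.0 textDocument/definition request.

    `line` and `character` are zero-based, per the LSP specification.
    """
    body = json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "textDocument/definition",
        "params": {
            "textDocument": {"uri": file_uri},
            "position": {"line": line, "character": character},
        },
    })
    # LSP messages are length-prefixed with a Content-Length header.
    return f"Content-Length: {len(body.encode('utf-8'))}\r\n\r\n{body}"

msg = definition_request(1, "file:///repo/pkg/module.py", 41, 7)
```

The server replies with zero or more `Location` objects; an empty result still counts as a completed (OK) request, but not as a success.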
🔄 Test Process
- Clone the top Python packages from GitHub
- Pre-pick all random Python files and identifier positions for the run
- For each language server, start a single LSP server per repository
- Send all "Go to Definition" requests to the same server instance
- Measure latency and verify that returned locations are valid
⏱️ Timeout: Requests have a 2-second timeout. Timeouts are counted as failures and excluded from latency statistics.
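The measurement loop above, including the 2-second timeout rule, can be sketched as follows. `send_request` is a stand-in for the real LSP client call and the function names are hypothetical; the point is the accounting: timeouts and errors are skipped entirely, so they never enter the latency samples.

```python
import time

TIMEOUT_S = 2.0  # requests slower than this count as failures

def run_benchmark(send_request, test_cases):
    """Send pre-picked test cases to one server instance.

    Returns (latencies_ms, ok_count, success_count). `send_request`
    returns the server's list of locations or raises TimeoutError.
    """
    latencies_ms, ok, success = [], 0, 0
    for case in test_cases:
        start = time.perf_counter()
        try:
            locations = send_request(case, timeout=TIMEOUT_S)
        except (TimeoutError, OSError):
            continue  # timeouts/errors are excluded from latency stats
        latencies_ms.append((time.perf_counter() - start) * 1000.0)
        ok += 1
        if locations:  # non-empty result counts toward Success Rate
            success += 1
    return latencies_ms, ok, success
```

Dividing `ok` and `success` by the total number of test cases yields the OK and Success rates reported in the table.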
🔁 Server Lifecycle: One LSP server is started per language server per repository. All test cases for that repo are sent to the same server, reflecting real-world IDE usage where a single server handles multiple requests. All language servers receive identical test cases for fair comparison.
🔥 Warmup: A 30-second warmup period is applied after server initialization to give the language server time to index before requests are sent. This can be configured via `--warmup`.
📊 Metrics Explained
- Latency: Time (ms) from request to response (excludes timeouts)
- P50/P95: 50th/95th percentile latencies
- OK Rate: Percentage of requests that completed without timeout or error (indicates reliability)
- Success Rate: Percentage of requests that returned a valid definition location pointing to a real file (indicates accuracy)
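These metrics can be computed from the raw per-request outcomes as sketched below. The function names are illustrative; the percentile uses the nearest-rank method, one common convention (other implementations interpolate, which gives slightly different P50/P95 values).

```python
import math

def percentile(samples, pct):
    """Nearest-rank percentile over completed-request latencies (ms)."""
    ordered = sorted(samples)
    if not ordered:
        return None
    rank = max(1, math.ceil(pct / 100.0 * len(ordered)))
    return ordered[rank - 1]

def summarize(latencies_ms, total_requests, valid_definitions):
    """Aggregate one (package, LSP) cell of the results table.

    `latencies_ms` holds only completed requests, so timeouts are
    excluded from Avg/P50/P95 but still lower the OK rate.
    """
    return {
        "avg_ms": sum(latencies_ms) / len(latencies_ms),
        "p50_ms": percentile(latencies_ms, 50),
        "p95_ms": percentile(latencies_ms, 95),
        "ok_pct": 100.0 * len(latencies_ms) / total_requests,
        "success_pct": 100.0 * valid_definitions / total_requests,
    }
```

For example, four completed requests of 10, 20, 30, and 40 ms out of five sent, with three valid definitions, give an 80% OK rate and a 60% success rate.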