Updated 3 hours ago. Sources: Text Arena, LiveBench Language
Language benchmarks
Chat preference rankings (Text Arena Elo) and language comprehension (LiveBench).
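The Elo scores in the Text Arena table can be read as pairwise preference odds. The sketch below assumes the standard logistic Elo curve on a 400-point scale; this page does not describe how the leaderboard fits its ratings, so treat the result as a rough interpretation. `expected_win_rate` is an illustrative helper name, not part of any Text Arena tooling.

```python
def expected_win_rate(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Expected share of head-to-head votes won by A over B.

    Assumption: standard logistic Elo model with a 400-point scale.
    The leaderboard's own rating method may differ.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


if __name__ == "__main__":
    # Ratings taken from the Text Arena table: rank 1 (1502) vs. rank 22 (1468).
    p = expected_win_rate(1502, 1468)
    print(f"Expected preference rate: {p:.1%}")  # roughly 55%
```

Under this assumption, a 34-point gap corresponds to only a modest edge in head-to-head votes, so closely ranked models are near coin-flips.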
Text Arena
| # | Model | Score | Input $/M | Output $/M | Context | Votes |
|---|---|---|---|---|---|---|
| 1 | Claude Opus 4.6 Thinking (Anthropic) | 1502 Elo | $5.00 | $25 | 1M | 27.5K |
| 2 | Claude Opus 4 7 Thinking (Anthropic) | 1500 Elo | $5.00 | $25 | 1M | 12.9K |
| 3 | Claude Opus 4.6 (Anthropic) | 1498 Elo | $5.00 | $25 | 1M | 29.2K |
| 4 | Claude Opus 4 7 (Anthropic) | 1492 Elo | $5.00 | $25 | 1M | 13.6K |
| 5 | Muse Spark (Meta) | 1489 Elo | — | — | — | 11.1K |
| 6 | Gemini 3.1 Pro (Google) | 1488 Elo | $2.00 | $12 | 1M | 34.2K |
| 7 | Gemini 3 Pro (Google) | 1486 Elo | $2.00 | $12 | 1M | 41.3K |
| 8 | Gpt 5.5 High (OpenAI) | 1481 Elo | $5.00 | $30 | 1M | 10.2K |
| 9 | Gemini 3.5 Flash (Google) | 1480 Elo | $1.50 | $9.00 | 1M | 5.9K |
| 10 | GPT-5.4 (OpenAI) | 1480 Elo | $2.50 | $15 | 1M | 21.0K |
| 11 | Gpt 5.5 (OpenAI) | 1478 Elo | $5.00 | $30 | 1M | 10.3K |
| 12 | Grok 4.20 (xAI) | 1478 Elo | — | — | — | 22.5K |
| 13 | — | 1477 Elo | $1.75 | $14 | 128K | 28.0K |
| 14 | Qwen3.7 Max Preview (Alibaba) | 1475 Elo | $2.50 | $7.50 | 1M | 3.7K |
| 15 | — | 1475 Elo | $2.00 | $6.00 | 2M | 21.6K |
| 16 | — | 1474 Elo | $2.00 | $6.00 | 2M | 21.6K |
| 17 | Gemini 3 Flash (Google) | 1473 Elo | $0.50 | $3.00 | 1M | 30.7K |
| 18 | Ernie 5.1 (Baidu) | 1473 Elo | — | — | — | 9.0K |
| 19 | Claude Opus 4.5 Thinking (Anthropic) | 1473 Elo | $5.00 | $25 | 200K | 37.1K |
| 20 | Glm 5.1 (Z.ai) | 1472 Elo | $1.40 | $4.40 | 203K | 12.3K |
| 21 | Gpt 5.5 Instant (OpenAI) | 1472 Elo | $5.00 | $30 | 1M | 10.8K |
| 22 | Claude Sonnet 4.6 (Anthropic) | 1468 Elo | $3.00 | $15 | 1M | 20.8K |
| 23 | Claude Opus 4.5 (Anthropic) | 1468 Elo | $5.00 | $25 | 200K | 58.9K |
| 24 | Gpt 5.4 (OpenAI) | 1467 Elo | $2.50 | $15 | 1M | 22.1K |
| 25 | — | 1467 Elo | — | — | — | 59.4K |
| 26 | Mimo v2.5 Pro (Xiaomi) | 1465 Elo | $1.00 | $3.00 | 1M | 9.7K |
| 27 | Qwen3.5 Max Preview (Alibaba) | 1464 Elo | — | — | — | 17.3K |
| 28 | — | 1463 Elo | $0.50 | $3.00 | 1M | 45.4K |
| 29 | Kimi K2.6 (Moonshot) | 1462 Elo | $0.95 | $4.00 | 262K | 10.3K |
| 30 | Deepseek v4 Pro Thinking (DeepSeek) | 1461 Elo | $0.43 | $0.87 | 1M | 10.0K |
| 31 | Grok 4.1 (xAI) | 1460 Elo | — | — | — | 63.3K |
| 32 | Deepseek v4 Pro (DeepSeek) | 1459 Elo | $0.43 | $0.87 | 1M | 10.7K |
| 33 | Qwen3.6 Max Preview (Alibaba) | 1457 Elo | $1.04 | $6.24 | 262K | 4.2K |
| 34 | GLM-5 (Z.ai) | 1457 Elo | $1.00 | $3.20 | 203K | 20.8K |
| 35 | Dola Seed 2.0 Pro (Bytedance) | 1456 Elo | — | — | — | 30.5K |
| 36 | Gpt 5.1 High (OpenAI) | 1455 Elo | $1.25 | $10 | 400K | 40.8K |
| 37 | Claude Sonnet 4.5 Thinking (Anthropic) | 1454 Elo | $3.00 | $15 | 200K | 71.0K |
| 38 | Claude Sonnet 4.5 (Anthropic) | 1454 Elo | $3.00 | $15 | 200K | 69.2K |
| 39 | GPT-5.4 Mini (OpenAI) | 1454 Elo | $0.75 | $4.50 | 400K | 19.0K |
| 40 | Gemma 4 31b (Google) | 1451 Elo | $0.14 | $0.40 | 262K | 5.8K |
| 41 | Grok 4.3 (xAI) | 1451 Elo | $1.25 | $2.50 | 1M | 9.1K |
| 42 | Ernie 5.0 0110 (Baidu) | 1450 Elo | — | — | — | 31.6K |
| 43 | Kimi K2.5 Thinking (Moonshot) | 1449 Elo | $0.60 | $3.00 | — | 30.7K |
| 44 | — | 1449 Elo | — | — | — | 9.8K |
| 45 | Gpt 5.3 Chat Latest (OpenAI) | 1449 Elo | $1.75 | $14 | 128K | 26.7K |
| 46 | — | 1449 Elo | $15 | $75 | 200K | 49.8K |
| 47 | MiMo V2 Pro (Xiaomi) | 1447 Elo | $1.00 | $3.00 | 1M | 19.0K |
| 48 | Claude Opus 4.1 (Anthropic) | 1447 Elo | $15 | $75 | 200K | 77.4K |
| 49 | Gemini 2.5 Pro (Google) | 1446 Elo | $1.25 | $10 | 1M | 118.7K |
| 50 | Qwen 3.5 397B (Alibaba) | 1445 Elo | $0.39 | $2.34 | 262K | 25.9K |
| 51 | — | 1444 Elo | $75 | $150 | 128K | 14.5K |
| 52 | Qwen3.6 Plus (Alibaba) | 1444 Elo | $0.33 | $1.95 | 1M | 12.1K |
| 53 | — | 1443 Elo | $5.00 | $15 | 128K | 82.5K |
| 54 | GLM-4.7 (Z.ai) | 1443 Elo | $0.40 | $1.75 | 203K | 12.1K |
| 55 | Deepseek v4 Flash Thinking (DeepSeek) | 1440 Elo | $0.10 | $0.20 | 1M | 10.1K |
| 56 | Gpt 5.2 High (OpenAI) | 1439 Elo | $1.75 | $14 | 400K | 42.1K |
| 57 | GPT-5.1 (OpenAI) | 1439 Elo | $1.25 | $10 | 400K | 43.5K |
| 58 | Gemma 4 26b A4b (Google) | 1438 Elo | — | — | — | 5.8K |
| 59 | Gemini 3.1 Flash Lite (Google) | 1436 Elo | $0.25 | $1.50 | 1M | 27.8K |
| 60 | GPT-5.2 (OpenAI) | 1436 Elo | $1.75 | $14 | 400K | 39.3K |
| 61 | Longcat Flash Chat 2602 Exp (Meituan) | 1435 Elo | — | — | — | 17.0K |
| 62 | Qwen3 Max Preview (Alibaba) | 1435 Elo | $0.78 | $3.90 | 262K | 27.7K |
| 63 | Gpt 5 High (OpenAI) | 1434 Elo | $1.25 | $10 | 400K | 32.0K |
| 64 | Deepseek v4 Flash (DeepSeek) | 1433 Elo | $0.10 | $0.20 | 1M | 10.1K |
| 65 | Kimi K2.5 Instant (Moonshot) | 1432 Elo | $0.40 | $1.90 | 262K | 8.2K |
| 66 | — | 1431 Elo | $0.20 | $0.50 | 2M | 52.8K |
| 67 | O3 2025 04 16 (OpenAI) | 1431 Elo | $2.00 | $8.00 | 200K | 59.8K |
| 68 | Kimi K2 Turbo (Moonshot) | 1430 Elo | $1.15 | $8.00 | 262K | 56.6K |
| 69 | Mimo v2.5 (Xiaomi) | 1429 Elo | $0.40 | $2.00 | 1M | 9.7K |
| 70 | — | 1427 Elo | — | — | — | 3.4K |
| 71 | Gpt 5 Chat (OpenAI) | 1427 Elo | $1.25 | $10 | 128K | 31.6K |
| 72 | GLM-4.6 (Z.ai) | 1426 Elo | $0.43 | $1.74 | 203K | 35.7K |
| 73 | Deepseek v3.2 Exp Thinking (DeepSeek) | 1425 Elo | $0.27 | $0.41 | 164K | 9.1K |
| 74 | Qwen3 Max 2025 09 23 (Alibaba) | 1424 Elo | $0.78 | $3.90 | 262K | 9.2K |
| 75 | Claude Opus 4 20250514 Thinking 16k (Anthropic) | 1424 Elo | $15 | $75 | 200K | 36.9K |
| 76 | DeepSeek V3.2 (DeepSeek) | 1424 Elo | $0.25 | $0.38 | 131K | 45.3K |
| 77 | — | 1423 Elo | $0.26 | $1.06 | — | 92.2K |
| 78 | Deepseek v3.2 Exp (DeepSeek) | 1423 Elo | $0.27 | $0.41 | 164K | 11.9K |
| 79 | Deepseek R1 0528 (DeepSeek) | 1422 Elo | $0.50 | $2.15 | 164K | 18.5K |
| 80 | DeepSeek V3.2 Thinking (DeepSeek) | 1422 Elo | $0.25 | $0.38 | 131K | 39.4K |
| 81 | — | 1421 Elo | $3.00 | $15 | 256K | 6.8K |
| 82 | — | 1419 Elo | — | — | — | 4.7K |
| 83 | Kimi K2 0905 Preview (Moonshot) | 1418 Elo | $0.60 | $2.50 | 262K | 11.8K |
| 84 | Deepseek v3.1 (DeepSeek) | 1418 Elo | $1.23 | $4.94 | — | 15.0K |
| 85 | Qwen 3.5 122B (Alibaba) | 1418 Elo | $0.26 | $2.08 | 262K | 23.1K |
| 86 | Hunyuan Hy3 Preview (Tencent) | 1417 Elo | $0.29 | $1.17 | 262K | 5.2K |
| 87 | Deepseek v3.1 Terminus Thinking (DeepSeek) | 1417 Elo | $0.27 | $0.95 | 164K | 3.5K |
| 88 | Kimi K2 0711 Preview (Moonshot) | 1417 Elo | $0.60 | $2.50 | 131K | 27.6K |
| 89 | Deepseek v3.1 Thinking (DeepSeek) | 1417 Elo | $1.23 | $4.94 | — | 11.8K |
| 90 | Deepseek v3.1 Terminus (DeepSeek) | 1416 Elo | $0.27 | $0.95 | 164K | 3.7K |
| 91 | — | 1416 Elo | — | — | — | 3.4K |
| 92 | Qwen3 Vl 235b A22b Instruct (Alibaba) | 1415 Elo | $0.20 | $0.88 | 262K | 11.5K |
| 93 | Mistral Large 3 (Mistral) | 1415 Elo | $0.50 | $1.50 | — | 41.7K |
| 94 | Gpt 4.1 2025 04 14 (OpenAI) | 1413 Elo | $2.00 | $8.00 | 1M | 51.0K |
| 95 | Claude Opus 4 20250514 (Anthropic) | 1412 Elo | $15 | $75 | 200K | 44.2K |
| 96 | — | 1412 Elo | $3.00 | $15 | 131K | 32.9K |
| 97 | Glm 4.5 (Z.ai) | 1411 Elo | $0.60 | $2.20 | 131K | 24.3K |
| 98 | Gemini 2.5 Flash (Google) | 1411 Elo | $0.30 | $2.50 | 1M | 118.5K |
| 99 | Claude Haiku 4.5 (Anthropic) | 1410 Elo | $1.00 | $5.00 | 200K | 70.9K |
| 100 | Grok 4 0709 (xAI) | 1410 Elo | $3.00 | $15 | 256K | 41.4K |
| 101 | Mistral Medium 2508 (Mistral) | 1410 Elo | $2.70 | $8.10 | 32K | 88.3K |
| 102 | MiniMax M2.7 (MiniMax) | 1409 Elo | $0.28 | $1.20 | 205K | 17.0K |
| 103 | Qwen 3.5 27B (Alibaba) | 1409 Elo | $0.20 | $1.56 | 262K | 22.5K |
| 104 | Gpt 5.4 Nano High (OpenAI) | 1406 Elo | $0.20 | $1.25 | 400K | 18.4K |
| 105 | — | 1405 Elo | $0.30 | $2.50 | 1M | 32.9K |
| 106 | — | 1404 Elo | $0.20 | $0.50 | 2M | 18.7K |
| 107 | Qwen3 235b A22b No Thinking (Alibaba) | 1403 Elo | $0.46 | $1.82 | 131K | 38.2K |
| 108 | Qwen3 Next 80b A3b Instruct (Alibaba) | 1402 Elo | $0.09 | $1.10 | 262K | 22.9K |
| 109 | O1 2024 12 17 (OpenAI) | 1402 Elo | $15 | $60 | 200K | 27.8K |
| 110 | Longcat Flash Chat (Meituan) | 1401 Elo | $0.20 | $0.80 | 131K | 11.4K |
| 111 | — | 1399 Elo | $0.15 | $1.50 | 262K | 9.0K |
| 112 | — | 1399 Elo | $3.00 | $15 | 1M | 35.1K |
| 113 | Deepseek R1 (DeepSeek) | 1398 Elo | $0.70 | $2.50 | 164K | 18.5K |
| 114 | Qwen3.5 35b A3b (Alibaba) | 1397 Elo | $0.14 | $1.00 | 262K | 23.5K |
| 115 | Hunyuan Vision 1.5 Thinking (Tencent) | 1396 Elo | — | — | — | 2.2K |
| 116 | Qwen3.5 Flash (Alibaba) | 1396 Elo | — | — | — | 23.3K |
| 117 | Qwen3 Vl 235b A22b Thinking (Alibaba) | 1396 Elo | $0.26 | $2.60 | 131K | 7.9K |
| 118 | Deepseek v3 0324 (DeepSeek) | 1395 Elo | $3.00 | $4.50 | 33K | 45.5K |
| 119 | Step 3.5 Flash (Stepfun) | 1395 Elo | $0.09 | $0.30 | 262K | 28.6K |
| 120 | — | 1395 Elo | — | — | — | 3.7K |
| 121 | MiniMax M2.5 (MiniMax) | 1394 Elo | $0.15 | $1.15 | 205K | 28.9K |
| 122 | — | 1393 Elo | $0.10 | $0.30 | 262K | 40.9K |
| 123 | Mai 1 Preview (Microsoft AI) | 1393 Elo | — | — | — | 17.9K |
| 124 | Gpt 5 Mini High (OpenAI) | 1390 Elo | $0.25 | $2.00 | 400K | 27.0K |
| 125 | O4 Mini 2025 04 16 (OpenAI) | 1390 Elo | $1.10 | $4.40 | 200K | 45.5K |
| 126 | Claude Sonnet 4 20250514 (Anthropic) | 1389 Elo | $3.00 | $15 | 1M | 40.3K |
| 127 | O1 Preview (OpenAI) | 1388 Elo | $15 | $60 | — | 31.1K |
| 128 | Hunyuan T1 20250711 (Tencent) | 1387 Elo | — | — | — | 4.7K |
| 129 | mimo-v2-flash (thinking) (Xiaomi) | 1387 Elo | $0.10 | $0.30 | 262K | 11.0K |
| 130 | Qwen 3 Coder (Alibaba) | 1387 Elo | $0.40 | $1.60 | 262K | 25.8K |
| 131 | — | 1387 Elo | $3.00 | $15 | 200K | 38.8K |
| 132 | Mistral Medium 2505 (Mistral) | 1387 Elo | $0.40 | $2.00 | 131K | 33.2K |
| 133 | MiniMax M2.1 (MiniMax) | 1385 Elo | $0.29 | $0.95 | 205K | 17.2K |
| 134 | Qwen3 30b A3b Instruct 2507 (Alibaba) | 1383 Elo | $0.09 | $0.30 | 262K | 23.8K |
| 135 | Gpt 4.1 Mini 2025 04 14 (OpenAI) | 1382 Elo | $0.40 | $1.60 | 1M | 39.4K |
| 136 | Hunyuan Turbos 20250416 (Tencent) | 1382 Elo | — | — | — | 10.7K |
| 137 | — | 1380 Elo | $0.10 | $0.40 | 1M | 47.2K |
| 138 | Glm 4.6v (Z.ai) | 1378 Elo | $0.30 | $0.90 | 131K | 2.8K |
| 139 | Trinity Large Preview (Arcee AI) | 1376 Elo | $0.15 | $0.45 | 131K | 24.9K |
| 140 | Qwen3 235b A22b (Alibaba) | 1375 Elo | $0.46 | $1.82 | 131K | 26.3K |
| 141 | — | 1375 Elo | $0.10 | $0.40 | 1M | 32.9K |
| 142 | Qwen2.5 Max (Alibaba) | 1374 Elo | — | — | — | 32.6K |
| 143 | Glm 4.5 Air (Z.ai) | 1373 Elo | $0.13 | $0.85 | 131K | 31.1K |
| 144 | Trinity Large Thinking (Arcee AI) | 1373 Elo | $0.22 | $0.85 | 262K | 16.4K |
| 145 | Claude 3 5 Sonnet 20241022 (Anthropic) | 1372 Elo | $3.00 | $15 | 200K | 88.4K |
| 146 | Claude 3 7 Sonnet 20250219 (Anthropic) | 1371 Elo | $3.00 | $15 | 200K | 43.2K |
| 147 | Qwen3 Next 80b A3b Thinking (Alibaba) | 1369 Elo | $0.10 | $0.78 | 262K | 13.7K |
| 148 | Glm 4.7 Flash (Z.ai) | 1368 Elo | $0.06 | $0.40 | 203K | 11.8K |
| 149 | — | 1367 Elo | — | — | — | 25.4K |
| 150 | Gemma 3 27b It (Google) | 1366 Elo | $0.08 | $0.16 | 131K | 47.6K |
| 151 | Minimax M1 (MiniMax) | 1364 Elo | $0.40 | $2.20 | 1M | 35.2K |
| 152 | O3 Mini High (OpenAI) | 1363 Elo | $1.10 | $4.40 | 200K | 18.6K |
| 153 | — | 1362 Elo | $0.25 | $1.27 | — | 17.0K |
| 154 | — | 1361 Elo | — | — | — | 7.4K |
| 155 | Gemini 2.0 Flash 001 (Google) | 1360 Elo | $0.10 | $0.40 | 1M | 43.8K |
| 156 | Deepseek v3 (DeepSeek) | 1358 Elo | $1.14 | $4.56 | — | 21.8K |
| 157 | Mistral Small 2506 (Mistral) | 1357 Elo | $0.10 | $0.30 | 32K | 17.7K |
| 158 | — | 1357 Elo | $0.30 | $0.50 | 131K | 22.7K |
| 159 | Intellect 3 (Prime Intellect) | 1357 Elo | $0.20 | $1.10 | 131K | 5.3K |
| 160 | Command A 03 2025 (Cohere) | 1354 Elo | $2.50 | $10 | 256K | 56.3K |
| 161 | Glm 4.5v (Z.ai) | 1353 Elo | $0.60 | $1.80 | 66K | 5.0K |
| 162 | — | 1353 Elo | $0.07 | $0.30 | 1M | 25.0K |
| 163 | Gpt Oss 120b (OpenAI) | 1353 Elo | $0.04 | $0.18 | 131K | 30.6K |
| 164 | Gemini 1.5 Pro 002 (Google) | 1351 Elo | $3.50 | $11 | 2M | 55.6K |
| 165 | — | 1350 Elo | — | — | — | 11.5K |
| 166 | Hunyuan Turbos 20250226 (Tencent) | 1349 Elo | — | — | — | 2.2K |
| 167 | Step 3 (Stepfun) | 1348 Elo | $0.57 | $1.42 | 66K | 6.6K |
| 168 | — | 1348 Elo | — | — | — | 2.8K |
| 169 | O3 Mini (OpenAI) | 1347 Elo | $1.10 | $4.40 | 200K | 57.3K |
| 170 | — | 1347 Elo | $0.60 | $1.80 | 131K | 2.5K |
| 171 | Qwen3 32b (Alibaba) | 1347 Elo | $0.08 | $0.28 | 131K | 3.9K |
| 172 | Mercury 2 (Inception AI) | 1347 Elo | $0.25 | $0.75 | 128K | 3.1K |
| 173 | Ling Flash 2.0 (Ant Group) | 1346 Elo | — | — | — | 7.0K |
| 174 | MiniMax M2 (MiniMax) | 1346 Elo | $0.26 | $1.00 | 205K | 6.9K |
| 175 | Qwen Plus 0125 (Alibaba) | 1346 Elo | $0.40 | $1.20 | 131K | 5.8K |
| 176 | Gpt 4o 2024 05 13 (OpenAI) | 1345 Elo | $5.00 | $15 | 128K | 112.9K |
| 177 | — | 1343 Elo | $0.10 | $0.40 | 131K | 3.3K |
| 178 | Glm 4 Plus 0111 (Z.ai) | 1343 Elo | — | — | — | 5.8K |
| 179 | Claude 3 5 Sonnet 20240620 (Anthropic) | 1342 Elo | $3.00 | $15 | 200K | 82.4K |
| 180 | Gemma 3 12b It (Google) | 1342 Elo | $0.04 | $0.13 | 131K | 3.8K |
| 181 | Hunyuan Turbo 0110 (Tencent) | 1340 Elo | — | — | — | 2.3K |
| 182 | Gpt 5 Nano High (OpenAI) | 1337 Elo | $0.05 | $0.40 | 400K | 8.3K |
| 183 | Nova 2 Lite (Amazon) | 1337 Elo | $0.30 | $2.50 | 1M | 12.2K |
| 184 | O1 Mini (OpenAI) | 1337 Elo | $1.10 | $4.40 | — | 52.0K |
| 185 | Qwq 32b (Alibaba) | 1336 Elo | $0.50 | $1.00 | 16K | 25.4K |
| 186 | — | 1335 Elo | $2.00 | $10 | 131K | 63.5K |
| 187 | Gemini Advanced 0514 (Google) | 1335 Elo | — | — | — | 50.1K |
| 188 | Gpt 4o 2024 08 06 (OpenAI) | 1335 Elo | $2.50 | $10 | 128K | 45.5K |
| 189 | — | 1334 Elo | $4.00 | $4.00 | 33K | 41.4K |
| 190 | Step 2 16k Exp 202412 (Stepfun) | 1334 Elo | — | — | — | 4.8K |
| 191 | — | 1333 Elo | $4.00 | $4.00 | 33K | 59.7K |
| 192 | — | 1330 Elo | $0.20 | $0.60 | 66K | 12.2K |
| 193 | Yi Lightning (01.AI) | 1328 Elo | — | — | — | 27.3K |
| 194 | Molmo 2 8b (Ai2) | 1328 Elo | $0.20 | $0.20 | 37K | 805 |
| 195 | — | 1328 Elo | — | — | — | 2.2K |
| 196 | Qwen3 30b A3b (Alibaba) | 1327 Elo | $0.09 | $0.45 | 131K | 26.5K |
| 197 | — | 1327 Elo | $0.63 | $1.80 | 131K | 40.0K |
| 198 | Hunyuan Large 2025 02 10 (Tencent) | 1326 Elo | — | — | — | 3.7K |
| 199 | Gpt 4 Turbo 2024 04 09 (OpenAI) | 1324 Elo | $10 | $30 | 128K | 98.1K |
| 200 | Deepseek v2.5 1210 (DeepSeek) | 1323 Elo | — | — | — | 6.8K |
| 201 | Gemini 1.5 Pro 001 (Google) | 1323 Elo | $3.50 | $11 | 2M | 79.1K |
| 202 | Claude 3 5 Haiku 20241022 (Anthropic) | 1323 Elo | $0.80 | $4.00 | 200K | 70.0K |
| 203 | — | 1322 Elo | $0.40 | $0.70 | 8K | 30.3K |
| 204 | Gpt 4.1 Nano 2025 04 14 (OpenAI) | 1322 Elo | $0.10 | $0.40 | 1M | 6.1K |
| 205 | Claude 3 Opus 20240229 (Anthropic) | 1321 Elo | $15 | $75 | 200K | 194.9K |
| 206 | Ring Flash 2.0 (Ant Group) | 1321 Elo | — | — | — | 7.2K |
| 207 | Step 1o Turbo 202506 (Stepfun) | 1320 Elo | — | — | — | 9.0K |
| 208 | Glm 4 Plus (Z.ai) | 1319 Elo | $0.44 | $1.76 | 205K | 26.1K |
| 209 | Gemma 3n E4b It (Google) | 1318 Elo | $0.06 | $0.12 | 33K | 22.6K |
| 210 | — | 1318 Elo | $0.10 | $0.32 | 131K | 54.7K |
| 211 | Qwen Max 0919 (Alibaba) | 1318 Elo | $1.60 | $6.40 | 33K | 16.5K |
| 212 | Gpt 4o Mini 2024 07 18 (OpenAI) | 1317 Elo | $0.15 | $0.60 | 128K | 68.7K |
| 213 | Gpt Oss 20b (OpenAI) | 1317 Elo | $0.03 | $0.14 | 131K | 10.6K |
| 214 | — | 1317 Elo | $0.06 | $0.24 | 262K | 15.5K |
| 215 | Qwen2.5 Plus 1127 (Alibaba) | 1315 Elo | — | — | — | 10.2K |
| 216 | Athene v2 Chat (NexusFlow) | 1314 Elo | — | — | — | 24.7K |
| 217 | Mistral Large 2407 (Mistral) | 1314 Elo | $2.00 | $6.00 | 131K | 45.5K |
| 218 | Gpt 4 0125 Preview (OpenAI) | 1312 Elo | $10 | $30 | 128K | 93.4K |
| 219 | Gpt 4 1106 Preview (OpenAI) | 1312 Elo | $10 | $30 | 128K | 100.1K |
| 220 | — | 1311 Elo | $0.05 | $0.10 | 131K | 3.2K |
| 221 | Hunyuan Standard 2025 02 10 (Tencent) | 1311 Elo | — | — | — | 3.9K |
| 222 | Gemini 1.5 Flash 002 (Google) | 1309 Elo | $0.07 | $0.30 | 1M | 34.9K |
| 223 | — | 1308 Elo | $2.00 | $10 | 131K | 52.6K |
| 224 | Deepseek v2.5 (DeepSeek) | 1307 Elo | — | — | — | 24.6K |
| 225 | Mercury (Inception AI) | 1306 Elo | $0.25 | $0.75 | 128K | 2.0K |
| 226 | Athene 70b 0725 (NexusFlow) | 1306 Elo | — | — | — | 19.6K |
| 227 | — | 1305 Elo | $0.15 | $0.50 | 66K | 6.0K |
| 228 | Mistral Large 2411 (Mistral) | 1305 Elo | $2.00 | $6.00 | 131K | 28.1K |
| 229 | Magistral Medium 2506 (Mistral) | 1304 Elo | $2.00 | $5.00 | 40K | 11.6K |
| 230 | Gemma 3 4b It (Google) | 1303 Elo | $0.04 | $0.08 | 131K | 4.2K |
| 231 | — | 1303 Elo | $0.10 | $0.30 | 32K | 33.2K |
| 232 | Qwen2.5 72b Instruct (Alibaba) | 1303 Elo | $1.20 | $1.20 | — | 39.4K |
| 233 | — | 1299 Elo | $1.20 | $1.20 | 131K | 7.1K |
| 234 | Hunyuan Large Vision (Tencent) | 1294 Elo | — | — | — | 5.4K |
| 235 | — | 1293 Elo | $0.40 | $0.40 | 131K | 55.2K |
| 236 | Amazon Nova Pro v1.0 (Amazon) | 1290 Elo | $0.80 | $3.20 | 300K | 24.7K |
| 237 | Jamba 1.5 Large (AI21 Labs) | 1289 Elo | $2.00 | $8.00 | 256K | 8.7K |
| 238 | Gemma 2 27b It (Google) | 1288 Elo | $0.65 | $0.65 | 8K | 75.8K |
| 239 | Reka Core 20240904 (Reka AI) | 1287 Elo | — | — | — | 7.3K |
| 240 | — | 1287 Elo | — | — | — | 5.7K |
| 241 | Gpt 4 0314 (OpenAI) | 1286 Elo | $30 | $60 | 8K | 54.2K |
| 242 | — | 1286 Elo | — | — | — | 2.8K |
| 243 | Gemini 1.5 Flash 001 (Google) | 1286 Elo | $0.07 | $0.30 | 1M | 62.8K |
| 244 | — | 1285 Elo | — | — | — | 3.7K |
| 245 | — | 1285 Elo | $0.15 | $0.50 | 66K | 8.5K |
| 246 | Claude 3 Sonnet 20240229 (Anthropic) | 1280 Elo | $3.00 | $15 | 200K | 109.3K |
| 247 | Gemma 2 9b It Simpo (Princeton) | 1279 Elo | $0.03 | $0.09 | 8K | 10.1K |
| 248 | Nemotron 4 340b Instruct (Nvidia) | 1276 Elo | — | — | — | 19.7K |
| 249 | Command R Plus 08 2024 (Cohere) | 1276 Elo | $2.50 | $10 | 128K | 9.9K |
| 250 | — | 1275 Elo | $0.51 | $0.74 | 8K | 156.9K |
| 251 | Gpt 4 0613 (OpenAI) | 1274 Elo | $30 | $60 | 8K | 88.7K |
| 252 | — | 1274 Elo | $0.05 | $0.08 | 33K | 14.7K |
| 253 | Glm 4 0520 (Z.ai) | 1273 Elo | — | — | — | 9.8K |
| 254 | Reka Flash 20240904 (Reka AI) | 1271 Elo | — | — | — | 7.5K |
| 255 | Qwen2.5 Coder 32b Instruct (Alibaba) | 1270 Elo | $0.87 | $0.87 | 32K | 5.4K |
| 256 | C4ai Aya Expanse 32b (Cohere) | 1267 Elo | — | — | — | 27.1K |
| 257 | Gemma 2 9b It (Google) | 1266 Elo | $0.03 | $0.09 | 8K | 54.6K |
| 258 | Deepseek Coder v2 (DeepSeek) | 1264 Elo | $0.14 | $0.28 | 128K | 15.1K |
| 259 | Qwen2 72b Instruct (Alibaba) | 1261 Elo | $0.90 | $0.90 | 33K | 37.3K |
| 260 | Command R Plus (Cohere) | 1261 Elo | $2.50 | $10 | 128K | 77.6K |
| 261 | Claude 3 Haiku 20240307 (Anthropic) | 1260 Elo | $0.25 | $1.25 | 200K | 117.7K |
| 262 | Amazon Nova Lite v1.0 (Amazon) | 1260 Elo | $0.06 | $0.24 | 300K | 19.4K |
| 263 | Gemini 1.5 Flash 8b 001 (Google) | 1258 Elo | $0.07 | $0.30 | 1M | 35.6K |
| 264 | Phi 4 (Microsoft) | 1256 Elo | $0.07 | $0.14 | 16K | 24.1K |
| 265 | — | 1251 Elo | $0.05 | $0.20 | 128K | 3.3K |
| 266 | Command R 08 2024 (Cohere) | 1249 Elo | $0.15 | $0.60 | 128K | 10.1K |
| 267 | Mistral Large 2402 (Mistral) | 1241 Elo | $4.00 | $12 | 32K | 62.4K |
| 268 | Amazon Nova Micro v1.0 (Amazon) | 1240 Elo | $0.04 | $0.14 | 128K | 19.4K |
| 269 | Jamba 1.5 Mini (AI21 Labs) | 1239 Elo | $0.20 | $0.40 | 256K | 8.9K |
| 270 | Ministral 8b 2410 (Mistral) | 1237 Elo | $0.10 | $0.10 | 131K | 4.8K |
| 271 | Gemini Pro Dev Api (Google) | 1235 Elo | $0.35 | $1.05 | 33K | 18.4K |
| 272 | Qwen1.5 110b Chat (Alibaba) | 1233 Elo | — | — | — | 26.2K |
| 273 | Hunyuan Standard 256k (Tencent) | 1233 Elo | — | — | — | 2.7K |
| 274 | — | 1232 Elo | — | — | — | 15.4K |
| 275 | Qwen1.5 72b Chat (Alibaba) | 1232 Elo | — | — | — | 39.3K |
| 276 | Mixtral 8x22b Instruct v0.1 (Mistral) | 1228 Elo | $0.90 | $0.90 | 66K | 51.4K |
| 277 | Command R (Cohere) | 1226 Elo | $0.15 | $0.60 | 128K | 54.0K |
| 278 | Reka Flash 21b 20240226 (Reka AI) | 1226 Elo | — | — | — | 24.8K |
| 279 | Gpt 3.5 Turbo 0125 (OpenAI) | 1223 Elo | $0.50 | $1.50 | 16K | 66.2K |
| 280 | C4ai Aya Expanse 8b (Cohere) | 1223 Elo | — | — | — | 9.8K |
| 281 | — | 1223 Elo | $0.04 | $0.04 | 8K | 104.6K |
| 282 | Mistral Medium (Mistral) | 1222 Elo | $2.70 | $8.10 | 32K | 34.5K |
| 283 | Gemini Pro (Google) | 1221 Elo | $0.35 | $1.05 | 33K | 6.4K |
| 284 | — | 1220 Elo | — | — | — | 2.9K |
| 285 | Yi 1.5 34b Chat (01.AI) | 1212 Elo | — | — | — | 24.1K |
| 286 | Zephyr Orpo 141b A35b v0.1 (HuggingFace) | 1212 Elo | — | — | — | 4.7K |
| 287 | — | 1211 Elo | $0.02 | $0.05 | 131K | 49.6K |
| 288 | — | 1207 Elo | — | — | — | 3.1K |
| 289 | Qwen1.5 32b Chat (Alibaba) | 1203 Elo | — | — | — | 21.7K |
| 290 | Gpt 3.5 Turbo 1106 (OpenAI) | 1202 Elo | $1.00 | $2.00 | 16K | 16.6K |
| 291 | Gemma 2 2b It (Google) | 1199 Elo | — | — | — | 46.6K |
| 292 | Phi 3 Medium 4k Instruct (Microsoft) | 1197 Elo | $0.17 | $0.68 | — | 25.1K |
| 293 | Mixtral 8x7b Instruct v0.1 (Mistral) | 1196 Elo | $0.63 | $0.63 | 32K | 73.5K |
| 294 | Dbrx Instruct Preview (Databricks) | 1194 Elo | $0.60 | $0.60 | 33K | 32.2K |
| 295 | Internlm2 5 20b Chat (InternLM) | 1190 Elo | $0.00 | $0.00 | 33K | 9.9K |
| 296 | Qwen1.5 14b Chat (Alibaba) | 1190 Elo | $0.30 | $0.30 | — | 17.8K |
| 297 | Wizardlm 70b (Microsoft) | 1184 Elo | — | — | — | 8.2K |
| 298 | Deepseek Llm 67b Chat (DeepSeek) | 1183 Elo | — | — | — | 4.9K |
| 299 | Yi 34b Chat (01.AI) | 1183 Elo | $0.90 | $0.90 | 4K | 15.5K |
| 300 | — | 1181 Elo | — | — | — | 6.6K |
| 301 | Openchat 3.5 (OpenChat) | 1181 Elo | $0.20 | $0.20 | — | 8.0K |
| 302 | Openchat 3.5 0106 (OpenChat) | 1181 Elo | — | — | — | 12.6K |
| 303 | Gemma 1.1 7b It (Google) | 1180 Elo | $0.03 | $0.09 | 8K | 23.9K |
| 304 | Snowflake Arctic Instruct (Snowflake) | 1178 Elo | — | — | — | 32.8K |
| 305 | — | 1178 Elo | — | — | — | 3.2K |
| 306 | Tulu 2 Dpo 70b (AllenAI/UW) | 1177 Elo | — | — | — | 6.5K |
| 307 | Openhermes 2.5 Mistral 7b (NousResearch) | 1174 Elo | $0.17 | $0.17 | — | 5.0K |
| 308 | Vicuna 33b (LMSYS) | 1172 Elo | $0.00 | $0.00 | 2K | 22.5K |
| 309 | Starling Lm 7b Beta (Nexusflow) | 1171 Elo | — | — | — | 16.1K |
| 310 | Phi 3 Small 8k Instruct (Microsoft) | 1170 Elo | $0.15 | $0.60 | — | 17.8K |
| 311 | Llama 2 70b Chat (Meta) | 1170 Elo | $0.70 | $2.80 | 4K | 38.5K |
| 312 | Starling Lm 7b Alpha (UC Berkeley) | 1166 Elo | — | — | — | 10.2K |
| 313 | — | 1166 Elo | $0.05 | $0.34 | 131K | 7.9K |
| 314 | Nous Hermes 2 Mixtral 8x7b Dpo (NousResearch) | 1164 Elo | $0.90 | $0.90 | — | 3.8K |
| 315 | Qwq 32b Preview (Alibaba) | 1155 Elo | $0.50 | $1.00 | 16K | 3.2K |
| 316 | — | 1155 Elo | — | — | — | 6.8K |
| 317 | Llama2 70b Steerlm Chat (Nvidia) | 1154 Elo | — | — | — | 3.6K |
| 318 | Solar 10.7b Instruct v1.0 (Upstage AI) | 1151 Elo | $0.30 | $0.30 | — | 4.2K |
| 319 | Dolphin 2.2.1 Mistral 7b (Cognitive Computations) | 1151 Elo | $0.50 | $0.50 | 16K | 1.7K |
| 320 | Mpt 30b Chat (MosaicML) | 1149 Elo | — | — | — | 2.6K |
| 321 | Mistral 7b Instruct v0.2 (Mistral) | 1148 Elo | $0.20 | $0.20 | 33K | 19.4K |
| 322 | Wizardlm 13b (Microsoft) | 1148 Elo | $0.30 | $0.30 | — | 7.0K |
| 323 | — | 1146 Elo | — | — | — | 1.3K |
| 324 | Qwen1.5 7b Chat (Alibaba) | 1143 Elo | $0.20 | $0.20 | — | 4.7K |
| 325 | Phi 3 Mini 4k Instruct June 2024 (Microsoft) | 1142 Elo | $0.13 | $0.52 | 4K | 12.3K |
| 326 | Llama 2 13b Chat (Meta) | 1140 Elo | $0.25 | $0.25 | 4K | 19.2K |
| 327 | Vicuna 13b (LMSYS) | 1140 Elo | $0.30 | $0.30 | — | 19.4K |
| 328 | Qwen 14b Chat (Alibaba) | 1137 Elo | — | — | — | 5.0K |
| 329 | Palm 2 (Google) | 1137 Elo | $0.50 | $0.50 | 26K | 8.6K |
| 330 | Gemma 7b It (Google) | 1136 Elo | $0.05 | $0.08 | 8K | 8.9K |
| 331 | — | 1135 Elo | $0.35 | $1.40 | 16K | 7.4K |
| 332 | Zephyr 7b Beta (HuggingFace) | 1130 Elo | $0.15 | $0.15 | 16K | 11.1K |
| 333 | Phi 3 Mini 128k Instruct (Microsoft) | 1128 Elo | $0.13 | $0.52 | — | 20.7K |
| 334 | Phi 3 Mini 4k Instruct (Microsoft) | 1127 Elo | $0.13 | $0.52 | — | 20.1K |
| 335 | — | 1126 Elo | — | — | — | 2.9K |
| 336 | Zephyr 7b Alpha (HuggingFace) | 1126 Elo | — | — | — | 1.8K |
| 337 | Stripedhyena Nous 7b (Together AI) | 1120 Elo | $0.20 | $0.20 | — | 5.2K |
| 338 | — | 1118 Elo | $0.70 | $2.80 | 16K | 1.1K |
| 339 | Gemma 1.1 2b It (Google) | 1114 Elo | — | — | — | 10.9K |
| 340 | Vicuna 7b (LMSYS) | 1114 Elo | $0.20 | $0.20 | — | 6.9K |
| 341 | Smollm2 1.7b Instruct (HuggingFace) | 1113 Elo | — | — | — | 2.2K |
| 342 | — | 1110 Elo | $0.03 | $0.20 | 131K | 8.0K |
| 343 | Mistral 7b Instruct (Mistral) | 1109 Elo | $0.07 | $0.28 | 4K | 9.0K |
| 344 | Llama 2 7b Chat (Meta) | 1107 Elo | $0.15 | $0.15 | 4K | 14.1K |
| 345 | Gemma 2b It (Google) | 1092 Elo | $0.10 | $0.10 | — | 4.8K |
| 346 | Qwen1.5 4b Chat (Alibaba) | 1089 Elo | $0.10 | $0.10 | — | 7.6K |
| 347 | — | 1073 Elo | $0.20 | $0.20 | — | 6.3K |
| 348 | Koala 13b (UC Berkeley) | 1069 Elo | — | — | — | 7.0K |
| 349 | Alpaca 13b (Stanford) | 1067 Elo | — | — | — | 5.7K |
| 350 | Gpt4all 13b Snoozy (Nomic AI) | 1065 Elo | — | — | — | 1.7K |
| 351 | Mpt 7b Chat (MosaicML) | 1061 Elo | — | — | — | 3.9K |
| 352 | Chatglm3 6b (Tsinghua) | 1055 Elo | — | — | — | 4.7K |
| 353 | RWKV 4 Raven 14B (RWKV) | 1040 Elo | — | — | — | 4.8K |
| 354 | Chatglm2 6b (Tsinghua) | 1023 Elo | — | — | — | 2.7K |
| 355 | Oasst Pythia 12b (OpenAssistant) | 1021 Elo | — | — | — | 6.3K |
| 356 | Chatglm 6b (Tsinghua) | 994 Elo | — | — | — | 4.9K |
| 357 | Fastchat T5 3b (LMSYS) | 990 Elo | — | — | — | 4.2K |
| 358 | Dolly v2 12b (Databricks) | 979 Elo | — | — | — | 3.4K |
| 359 | Llama 13b (Meta) | 972 Elo | $0.23 | $0.23 | — | 2.4K |
| 360 | Stablelm Tuned Alpha 7b (Stability) | 952 Elo | — | — | — | 3.3K |
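The two price columns can be combined into a per-request estimate. The sketch below assumes that "$/M" means US dollars per one million tokens and that input and output tokens are billed linearly at the listed rates; actual provider billing (caching discounts, long-context tiers) may differ. `request_cost_usd` and the token counts are hypothetical, chosen only for illustration.

```python
def request_cost_usd(
    input_tokens: int,
    output_tokens: int,
    input_price_per_m: float,
    output_price_per_m: float,
) -> float:
    """Estimated cost of one request in US dollars.

    Assumption: prices are quoted per 1,000,000 tokens and billed linearly.
    """
    return (
        input_tokens * input_price_per_m + output_tokens * output_price_per_m
    ) / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 10,000 prompt tokens, 2,000 completion tokens.
    workload = (10_000, 2_000)

    # Prices copied from the table: Claude Opus 4.6 ($5.00 / $25)
    # and Gemini 3 Flash ($0.50 / $3.00).
    opus = request_cost_usd(*workload, 5.00, 25.0)
    flash = request_cost_usd(*workload, 0.50, 3.00)

    print(f"Claude Opus 4.6: ${opus:.4f} per request")
    print(f"Gemini 3 Flash:  ${flash:.4f} per request")
```

Comparing this figure against the Elo gap between two models gives a rough sense of what a ranking difference costs for a given workload.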
LiveBench Language
| # | Model | Score | Input $/M | Output $/M | Context | CI |
|---|---|---|---|---|---|---|
| 1 | GPT-5.5 Thinking xHigh Effort (OpenAI) | 87.7% | — | — | — | — |
| 2 | Gemini 3.1 Pro Preview High (Google) | 85.4% | — | — | — | — |
| 3 | Gemini 3 Pro Preview High (Google) | 84.6% | — | — | — | — |
| 4 | Gemini 3.5 Flash High (Google) | 84.6% | — | — | — | — |
| 5 | Gemini 3 Flash Preview High (Google) | 84.6% | — | — | — | — |
| 6 | Claude 4.6 Opus Thinking High Effort (Anthropic) | 83.3% | — | — | — | — |
| 7 | GPT-5.4 Thinking xHigh Effort (OpenAI) | 82.6% | — | — | — | — |
| 8 | Claude 4.5 Opus Thinking High Effort (Anthropic) | 81.3% | — | — | — | — |
| 9 | GPT-5 Pro (OpenAI) | 80.7% | — | — | — | — |
| 10 | GPT-5.3 Codex High (OpenAI) | 80.1% | — | — | — | — |
| 11 | GPT-5.2 High (OpenAI) | 79.8% | — | — | — | — |
| 12 | Qwen 3.7 Max (Alibaba) | 79.7% | — | — | — | — |
| 13 | GPT-5.1 High (OpenAI) | 79.3% | — | — | — | — |
| 14 | Claude 4.5 Opus Medium Effort (Anthropic) | 78.7% | — | — | — | — |
| 15 | DeepSeek V4 Pro (DeepSeek) | 78.1% | — | — | — | — |
| 16 | Claude 4.7 Opus Thinking xHigh Effort (Anthropic) | 77.9% | — | — | — | — |
| 17 | Grok 4.20 Beta (xAI) | 77.7% | — | — | — | — |
| 18 | Kimi K2.5 Thinking (Moonshot AI) | 77.7% | — | — | — | — |
| 19 | GLM 5 (Z.AI) | 77.5% | — | — | — | — |
| 20 | Claude 4.1 Opus (Anthropic) | 76.8% | — | — | — | — |
| 21 | GPT-5.1 Codex Max High (OpenAI) | 76.5% | — | — | — | — |
| 22 | Claude Sonnet 4.5 Thinking (Anthropic) | 76.5% | — | — | — | — |
| 23 | Grok 4 (xAI) | 76.4% | — | — | — | — |
| 24 | Claude 4.6 Sonnet Thinking Medium Effort (Anthropic) | 76.1% | — | — | — | — |
| 25 | Claude Sonnet 4.5 (Anthropic) | 76.0% | — | — | — | — |
| 26 | GPT-5 Mini High (OpenAI) | 75.5% | — | — | — | — |
| 27 | Gemini 2.5 Pro (Max Thinking) (Google) | 75.5% | — | — | — | — |
| 28 | Kimi K2.6 Thinking (Moonshot AI) | 75.1% | — | — | — | — |
| 29 | Qwen 3.6 Plus (Alibaba) | 75.0% | — | — | — | — |
| 30 | Grok 4.1 Fast (xAI) | 74.3% | — | — | — | — |
| 31 | GPT-5.2 Codex (OpenAI) | 73.7% | — | — | — | — |
| 32 | Grok 4.3 (xAI) | 73.6% | — | — | — | — |
| 33 | Gemini 3.1 Flash Lite Preview High (Google) | 73.2% | — | — | — | — |
| 34 | Claude 4 Sonnet Thinking (Anthropic) | 72.9% | — | — | — | — |
| 35 | Claude 4.1 Opus Thinking (Anthropic) | 72.8% | — | — | — | — |
| 36 | GLM 5.1 (Z.AI) | 71.8% | — | — | — | — |
| 37 | GPT-5.4 Mini xHigh (OpenAI) | 71.5% | — | — | — | — |
| 38 | Gemma 4 31B (Google) | 71.3% | — | — | — | — |
| 39 | DeepSeek V3.2 Exp Thinking (DeepSeek) | 71.1% | — | — | — | — |
| 40 | Claude 4 Sonnet (Anthropic) | 71.0% | — | — | — | — |
| 41 | DeepSeek V3.2 Thinking (DeepSeek) | 70.4% | — | — | — | — |
| 42 | DeepSeek V4 Flash (DeepSeek) | 70.1% | — | — | — | — |
| 43 | GPT-5.3 Instant (OpenAI) | 70.0% | — | — | — | — |
| 44 | Qwen 3 235B A22B Thinking 2507 (Alibaba) | 69.5% | — | — | — | — |
| 45 | GPT-5.1 Codex (OpenAI) | 69.5% | — | — | — | — |
| 46 | MiMo V2 Pro (Xiaomi) | 69.1% | — | — | — | — |
| 47 | Minimax M2.7 (Minimax) | 66.8% | — | — | — | — |
| 48 | Kimi K2 Instruct (Moonshot AI) | 66.7% | — | — | — | — |
| 49 | Kimi K2 Thinking (Moonshot AI) | 66.5% | — | — | — | — |
| 50 | Claude Haiku 4.5 Thinking (Anthropic) | 66.5% | — | — | — | — |
| 51 | Qwen 3 Next 80B A3B Instruct (Alibaba) | 66.3% | — | — | — | — |
| 52 | Qwen 3 235B A22B Instruct 2507 (Alibaba) | 66.1% | — | — | — | — |
| 53 | DeepSeek V3.2 Exp (DeepSeek) | 65.6% | — | — | — | — |
| 54 | Gemini 2.5 Flash (Max Thinking) (2025-09-25) (Google) | 65.3% | — | — | — | — |
| 55 | GLM 4.7 (Z.AI) | 65.2% | — | — | — | — |
| 56 | DeepSeek V3.2 (DeepSeek) | 64.2% | — | — | — | — |
| 57 | Qwen 3.6 27B (Alibaba) | 63.3% | — | — | — | — |
| 58 | Qwen 3.6 Flash (Alibaba) | 63.1% | — | — | — | — |
| 59 | GPT-5.1 Codex Mini (OpenAI) | 63.0% | — | — | — | — |
| 60 | GPT-5.4 Nano xHigh (OpenAI) | 62.5% | — | — | — | — |
| 61 | GLM 5V Turbo (Z.AI) | 62.3% | — | — | — | — |
| 62 | Gemini 2.5 Flash (Max Thinking) (2025-06-05) (Google) | 62.3% | — | — | — | — |
| 63 | GLM 4.6 (Z.AI) | 59.0% | — | — | — | — |
| 64 | Claude Haiku 4.5 (Anthropic) | 57.0% | — | — | — | — |
| 65 | Qwen 3 Next 80B A3B Thinking (Alibaba) | 56.3% | — | — | — | — |
| 66 | Qwen 3 32B (Alibaba) | 55.5% | — | — | — | — |
| 67 | Minimax M2.5 (Minimax) | 55.1% | — | — | — | — |
| 68 | Qwen 3 30B A3B (Alibaba) | 54.5% | — | — | — | — |
| 69 | GPT-5.1 No Thinking (OpenAI) | 53.8% | — | — | — | — |
| 70 | Gemini 2.5 Flash Lite (Max Thinking) (2025-09-25) (Google) | 52.6% | — | — | — | — |
| 71 | Gemini 2.5 Flash Lite (Max Thinking) (2025-06-17) (Google) | 52.0% | — | — | — | — |
| 72 | Grok 4.1 Fast (Non-Reasoning) (xAI) | 50.0% | — | — | — | — |
| 73 | GPT-5.2 No Thinking (OpenAI) | 50.0% | — | — | — | — |
| 74 | GLM 4.6V (Z.AI) | 49.7% | — | — | — | — |
| 75 | GPT OSS 120b (OpenAI) | 48.6% | — | — | — | — |
| 76 | Grok Code Fast (xAI) | 48.6% | — | — | — | — |
| 77 | GPT-5 Nano High (OpenAI) | 46.8% | — | — | — | — |
| 78 | Devstral 2 (Mistral) | 45.7% | — | — | — | — |
| 79 | Trinity Large Preview (Arcee) | 42.1% | — | — | — | — |
| 80 | Grok 4.20 Beta (Non-Reasoning) (xAI) | 42.0% | — | — | — | — |
| 81 | Nemotron 3 Super 120B A12B (NVIDIA) | 30.0% | — | — | — | — |
| 82 | Elephant Alpha (OpenRouter) | 27.8% | — | — | — | — |
Benchmarks are a starting point, not an answer. The right model depends on your workload, budget, and integration constraints.