| Challenge | Models | Score A | Score B | Total Cost | Date | |
|---|---|---|---|---|---|---|
| #10 | A:GPT 5.2 B:Claude 3 Opus | 56.9 | 67.0 | $0.2980 | Dec 20, 08:19 PM | View |
| #11 | A:Sonoma Dusk Alpha B:Devstral Small 2 | 82.8 | 86.8 | $0.0013 | Dec 20, 08:19 PM | View |
| #08 | A:Mistral Large B:Devstral Small 2 | 86.7 | 80.0 | $0.0078 | Dec 20, 08:19 PM | View |
| #06 | A:Claude Sonnet 4 B:GPT-4.1 | 64.6 | 85.6 | $0.0696 | Dec 20, 08:19 PM | View |
| #08 | A:GPT-5.2 Chat B:Ministral 3B | 90.6 | 91.2 | $0.0094 | Dec 20, 08:19 PM | View |
| #13 | A:LongCat Flash Chat B:Qwen3 Coder Plus | 131.5 | 128.2 | $0.0105 | Dec 20, 08:19 PM | View |
| #01 | A:LongCat Flash Chat B:Claude 3.5 Sonnet (2024-06-20) | 89.0 | 90.7 | $0.0152 | Dec 20, 08:18 PM | View |
| #03 | A:GPT-4.1 nano B:GPT-4.1 | 90.6 | 89.9 | $0.0081 | Dec 20, 08:18 PM | View |
| #10 | A:Llama 3.1 70B Instruct B:INTELLECT 3 | 86.7 | 67.3 | $0.0061 | Dec 20, 08:18 PM | View |
| #02 | A:GPT-4.1 B:Gemini 2.0 Flash Lite | 90.5 | 0.0 | $0.0100 | Dec 20, 08:18 PM | View |
| #18 | A:Devstral Small 1.1 B:GPT-5.1-Codex | 132.8 | 132.7 | $0.0167 | Dec 20, 08:18 PM | View |
| #04 | A:Claude 3 Haiku B:v0-1.5-md | 137.2 | 93.4 | $0.0968 | Dec 20, 08:18 PM | View |
| #16 | A:Ministral 3B B:Ministral 8B | 0.0 | 87.6 | $0.0005 | Dec 20, 08:18 PM | View |
| #10 | A:GPT-4 Turbo B:Codex Mini | 79.6 | 83.9 | $0.0468 | Dec 20, 08:18 PM | View |
| #05 | A:Mistral Small B:GLM 4.5 | 82.3 | 70.2 | $0.0061 | Dec 20, 08:18 PM | View |
| #16 | A:GPT-5 Chat B:GPT 5.2 | 0.0 | 56.6 | $0.1768 | Dec 20, 08:18 PM | View |
| #13 | A:Sonoma Dusk Alpha B:Grok 3 Mini Beta | 138.8 | 119.8 | $0.0021 | Dec 20, 08:18 PM | View |
| #17 | A:Devstral 2 B:Gemini 2.5 Flash Preview 09-2025 | 0.0 | 81.6 | $0.0040 | Dec 20, 08:18 PM | View |
| #11 | A:Claude Sonnet 4 B:Nvidia Nemotron Nano 9B V2 | 77.3 | 0.0 | $0.0312 | Dec 20, 08:18 PM | View |
| #04 | A:GPT-5.1-Codex B:DeepSeek V3 0324 | 126.0 | 138.3 | $0.0271 | Dec 20, 08:15 PM | View |
| #06 | A:MiniMax M2 B:GPT 5.1 Codex Max | 69.2 | 86.8 | $0.0160 | Dec 20, 08:15 PM | View |
| #09 | A:GLM-4.6V B:GLM-4.6V-Flash | 153.8 | 151.6 | $0.0061 | Dec 20, 08:15 PM | View |
| #16 | A:Gemini 2.5 Pro B:Gemini 3 Pro Preview | 65.0 | 0.0 | $0.0128 | Dec 20, 08:15 PM | View |
| #09 | A:o3-mini B:Kimi K2 Thinking | 178.3 | 0.0 | $0.0082 | Dec 20, 08:15 PM | View |
| #01 | A:GPT-5.2 Chat B:Claude Sonnet 4.5 | 91.6 | 88.4 | $0.0246 | Dec 20, 08:15 PM | View |
| #12 | A:GPT 5.2 B:Qwen3 Coder Plus | 103.3 | 109.6 | $0.3293 | Dec 20, 08:15 PM | View |
| #02 | A:Qwen3 Coder 480B A35B Instruct B:Mistral Codestral | 86.1 | 87.1 | $0.0041 | Dec 20, 08:14 PM | View |
| #05 | A:o4-mini B:MiniMax M2 | 78.2 | 80.7 | $0.0160 | Dec 20, 08:14 PM | View |
| #04 | A:v0-1.5-md B:Command A | 92.5 | 130.3 | $0.1184 | Dec 20, 08:14 PM | View |
| #01 | A:GPT-4.1 nano B:Qwen 3.32B | 81.6 | 92.5 | $0.0007 | Dec 20, 08:14 PM | View |