| Challenge | Models | Score A | Score B | Total Cost | Date | |
|---|---|---|---|---|---|---|
| #01 | A:Gemini 2.5 Flash Lite Preview 09-2025 B:GLM 4.5V | 93.7 | 0.0 | $0.0018 | Dec 20, 08:19 PM | View |
| #11 | A:GPT 5.2 B:DeepSeek V3 0324 | 49.5 | 0.0 | $0.2934 | Dec 20, 08:19 PM | View |
| #01 | A:GLM 4.6 B:Qwen3-30B-A3B | 81.5 | 63.0 | $0.0040 | Dec 20, 08:19 PM | View |
| #10 | A:Claude 3.5 Sonnet (2024-06-20) B:Claude 3 Haiku | 84.8 | 87.6 | $0.0239 | Dec 20, 08:19 PM | View |
| #17 | A:Claude Sonnet 4.5 B:Command A | 78.0 | 83.8 | $0.0556 | Dec 20, 08:19 PM | View |
| #01 | A:INTELLECT 3 B:Sonoma Dusk Alpha | 66.3 | 93.7 | $0.0059 | Dec 20, 08:19 PM | View |
| #08 | A:GPT-4 Turbo B:Kimi K2 Turbo | 82.5 | 84.2 | $0.0490 | Dec 20, 08:19 PM | View |
| #07 | A:Gemini 3 Pro Preview B:Grok 3 Fast Beta | 0.0 | 131.5 | $0.0332 | Dec 20, 08:19 PM | View |
| #08 | A:Qwen3 235B A22B Thinking 2507 B:Grok 3 Mini Beta | 61.9 | 81.3 | $0.0099 | Dec 20, 08:19 PM | View |
| #02 | A:Ministral 8B B:INTELLECT 3 | 90.4 | 54.3 | $0.0082 | Dec 20, 08:19 PM | View |
| #17 | A:Command A B:Gemini 3 Pro Preview | 85.2 | 0.0 | $0.0236 | Dec 20, 08:19 PM | View |
| #05 | A:Claude Sonnet 4.5 B:Grok 4 Fast Non-Reasoning | 75.9 | 82.2 | $0.0326 | Dec 20, 08:19 PM | View |
| #17 | A:Ministral 3B B:Claude Haiku 4.5 | 81.2 | 85.0 | $0.0112 | Dec 20, 08:19 PM | View |
| #15 | A:DeepSeek V3.2 B:GPT-4.1 mini | 89.4 | 128.4 | $0.0084 | Dec 20, 08:19 PM | View |
| #07 | A:Qwen3 Max Preview B:LongCat Flash Chat | 119.7 | 130.2 | $0.0070 | Dec 20, 08:19 PM | View |
| #02 | A:GPT-5.2 Chat B:Qwen 3 Coder 30B A3B Instruct | 84.7 | 86.3 | $0.0116 | Dec 20, 08:19 PM | View |
| #12 | A:GLM 4.6 B:GPT-5 | 110.5 | 104.0 | $0.0480 | Dec 20, 08:19 PM | View |
| #16 | A:Qwen 3 Coder 30B A3B Instruct B:GPT-5.1-Codex | 88.6 | 42.5 | $0.0727 | Dec 20, 08:19 PM | View |
| #02 | A:Qwen3 Max Preview B:GLM 4.6 | 88.5 | 75.2 | $0.0097 | Dec 20, 08:19 PM | View |
| #10 | A:Kimi K2 Thinking Turbo B:o4-mini | 0.0 | 88.0 | $0.0174 | Dec 20, 08:19 PM | View |
| #02 | A:GPT-5 nano B:GPT 5.2 | 79.2 | 55.9 | $0.2574 | Dec 20, 08:19 PM | View |
| #16 | A:Ministral 3B B:Grok 3 Fast Beta | 55.5 | 78.2 | $0.0355 | Dec 20, 08:19 PM | View |
| #02 | A:DeepSeek V3.2 Thinking B:Qwen3 Max Preview | 0.0 | 86.0 | $0.0061 | Dec 20, 08:19 PM | View |
| #08 | A:MiniMax M2 B:GPT-5 mini | 84.4 | 87.4 | $0.0040 | Dec 20, 08:19 PM | View |
| #06 | A:GPT-4.1 nano B:Grok 4 Fast Reasoning | 81.2 | 78.4 | $0.0012 | Dec 20, 08:19 PM | View |
| #16 | A:GPT-5.2 B:GPT-4 Turbo | 86.2 | 80.2 | $0.0522 | Dec 20, 08:19 PM | View |
| #01 | A:Grok 3 Beta B:Qwen3-30B-A3B | 87.5 | 78.0 | $0.0127 | Dec 20, 08:19 PM | View |
| #11 | A:Qwen3 Coder 480B A35B Instruct B:Claude 3.7 Sonnet | 81.3 | 78.8 | $0.0347 | Dec 20, 08:19 PM | View |
| #10 | A:GPT-5 B:Claude Sonnet 4.5 | 79.7 | 82.1 | $0.0361 | Dec 20, 08:19 PM | View |
| #01 | A:Mistral Codestral B:v0-1.0-md | 91.2 | 78.7 | $0.0570 | Dec 20, 08:19 PM | View |