Claude 3.7 Sonnet Leads SWE-bench: Best AI for Software Engineering
📢 Claude 3.7 Sonnet just dominated the SWE-bench Verified leaderboard for software engineering!
🛠️ Key results:
62.3% accuracy on SWE-bench Verified (70.3% with custom scaffolding)
Outperforms Claude 3.5 Sonnet, OpenAI's o1, and DeepSeek R1 (all under 50%)
Big upgrade for real-world coding tasks, debugging, and full-stack updates
🔥 Let’s hear your thoughts!