Claude 3.7 Sonnet Can Play Pokémon?! AI Benchmarks Are Getting Weird
Apart from excelling at reasoning & coding, Claude 3.7 Sonnet apparently outperformed all other models in Pokémon gameplay tests.
First, the AI benchmarks were chess & Go. Then we got Dota 2 & StarCraft II. Now… Pokémon??
What’s next—Claude vs AlphaGo in competitive Uno? 😂