One of the World’s Most Advanced AI Agents Is Completely Stuck Trying to Beat a Pokémon Game for Children


In case you haven’t heard, Anthropic has been livestreaming its AI model, Claude 3.7 Sonnet, attempting to complete a playthrough of Pokémon Red.

The experiment, dubbed “Claude Plays Pokémon,” is intended to be a demonstration of “AI agents,” the focus of the industry’s ongoing race to create AI models capable of operating autonomously by interacting with their environment.

Claude has managed to get surprisingly far into the game, clinching three Gym badges and reaching, as of this week, Cerulean City. But it plods along at a painfully slow pace, stopping to “think” after every single move, with some pauses lasting far longer than others. For nearly 80 agonizing hours, for instance, Claude bumbled cluelessly around Mt. Moon before finally finding the ladder it needed to escape. Invested Twitch viewers breathed a sigh of relief.

Progress doesn’t look poised to speed up. The Anthropic AI’s excursion through the Kanto region has mostly devolved into running around in circles, with the model unsure of its next move. It needs to hop on Route 5 to reach the next stage, but where and how?

A text window in the livestream showing Claude’s thought process reveals that the AI is using a process of elimination to rule out locations that aren’t the Route 5 entrance. But will it piece together that it needs to use the HM “Cut” on a few destructible trees to access the fabled path? It’s not looking likely: it keeps repeating that it needs to find the “gatehouse” to the route instead.

In short, Claude is stuck. One of the AI industry’s leading models may well be stumped by a game that’s been beaten by literal children for generations.

According to engineers, a major challenge for Claude is visually processing what it sees in the game. Claude excels at interpreting the game’s text-based portions, including the Pokémon battles. It also has access to the game’s RAM to glean information like its in-game coordinates. But it can’t consistently interpret the tiny number of pixels that make up its low-res environment.

“Claude’s still not particularly good at understanding what’s on the screen at all,” David Hershey, the Anthropic engineer behind the Pokémon experiment, told Ars Technica in a recent interview. “You will see it attempt to walk into walls all the time.” Ironically, Hershey suggests, if Claude were playing a more visually realistic game, it might do better.

“It’s pretty easy for me to understand that [an in-game] building is a building and that I can’t walk through a building,” Hershey added. “And that’s [something] that’s pretty challenging for Claude to understand.”

There are times, however, when Claude is surprisingly clever, such as when it responds to in-game clues that are designed to be misleading.

“It’s pretty funny that they tell you you need to go find Professor Oak next door and then he’s not there,” Hershey told Ars, describing one of the first missions in the game. “As a 5-year-old, that was very confusing to me. But Claude actually typically goes through that same set of motions where it talks to mom, goes to the lab, doesn’t find [Oak], says, ‘I need to figure something out.'”

“It’s sophisticated enough to sort of go through the motions of the way [humans are] actually supposed to learn it, too,” Hershey added.

So maybe all is not lost yet. There’s still plenty of time for Claude 3.7 Sonnet to turn things around. It’s gotten significantly farther than its predecessor 3.0 Sonnet, which couldn’t even make it out of Pallet Town, the game’s starting area. Even so, its struggles show that the technology has a long way to go to be “agentic,” let alone fulfill its promise of one day exceeding human capabilities.



