
Curious how this will fare when playing Pokemon Red.




Gemini 3 Pro has been playing Pokemon Crystal (which is significantly harder than Red) in a race against Gemini 2.5 Pro: https://www.twitch.tv/gemini_plays_pokemon

Gemini 3 Pro has been making steady progress (12/16 badges) while Gemini 2.5 Pro is stuck (3/16 badges) despite using double the turns and tokens.


I think what would be interesting is if it could play the game with vision-only inputs. That would represent a massive leap in multimodal understanding.

Yeah the "High frame rate understanding" feature caught my eye, actual real time analysis of live video feeds seems really cool. Also wondering what they mean by "video reasoning/thinking"?

I don’t think it’s real time? The videos were likely taken previously.

> 3. Turning long videos into action: Gemini 3 Pro bridges the gap between video and code. It can extract knowledge from long-form content and immediately translate it into functioning apps or structured code

I'm curious how close these models are to fulfilling that long-ago, widely mocked claim (by Microsoft, I think?) that AIs could watch gameplay video of long-lost games and produce the code to emulate them.
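For concreteness, here's a rough sketch of what that "long video in, code out" flow looks like against the Gemini API today. The model identifier, file name, and prompt are illustrative assumptions, not a claim about how the announced feature actually works:

    from google import genai

    client = genai.Client(api_key="YOUR_API_KEY")

    # Upload a long gameplay video via the Files API; large videos may
    # need a short wait until the uploaded file finishes processing.
    video = client.files.upload(file="lost_game_footage.mp4")

    # Ask the model to turn what it observed into runnable code.
    response = client.models.generate_content(
        model="gemini-3-pro-preview",  # assumed model identifier
        contents=[
            video,
            "Watch this gameplay footage and write a minimal Python "
            "prototype that reproduces the game's core loop and rules.",
        ],
    )
    print(response.text)

Whether the output is anything close to a faithful emulation of the original game is exactly the open question being discussed here.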




