Google Unveils Gemini Robotics and Gemini Robotics-ER, Built to Master the Real World

Google’s been cooking up something big with their Gemini 2.0 models, and now they’re rolling out AI that could change how robots fit into our lives. These aren’t brains for your average clunky machines — they’re smart, adaptable, and built to handle the real world like we do.
The folks at Google DeepMind have been busy. They’ve taken their Gemini 2.0 tech and turned it into two new AI models for robotics.
One’s called Gemini Robotics, a vision-language-action (VLA) model that controls robots directly, pairing physical actions with Gemini’s knack for understanding text, images, and more.
The other, Gemini Robotics-ER, focuses on spatial smarts, letting robot experts plug it into their own systems for what they call ‘embodied reasoning.’ Basically, it’s AI that gets how the world works in 3D.
What’s cool about these robots? They’re not stuck doing one thing. Gemini Robotics can figure out new tasks on its own, even stuff it wasn’t trained for. Think picking up unfamiliar objects or working in places it’s never seen. Google says it more than doubles the performance of other top models on a comprehensive generalization benchmark — in plain terms, it copes with new objects, instructions, and environments far better than what came before. And it’s not just smart — it’s quick. ‘If an object slips from its grasp, or someone moves an item around, Gemini Robotics quickly replans and carries on,’ they say in a video demo. Sounds like it could keep up with my messy kitchen.
Then there’s the hands-on stuff. These robots can do tricky tasks we humans take for granted, like folding origami or sealing a snack in a Ziploc bag. ‘Gemini Robotics displays advanced levels of dexterity,’ Google claims in another clip. I’m picturing a robot packing my lunch better than I can. Plus, it’s not picky about its body — it works with different robot types, from two-armed setups like ALOHA 2 to humanoid figures like Apptronik’s Apollo.
The chatty side is wild too. A demo titled ‘Gemini Robotics: Interactive’ shows it reacting to plain old conversational English — or other languages — and tweaking its moves if something changes. ‘It can understand and respond to a much broader set of natural language instructions than our previous models,’ the team says. Imagine telling it ‘Hey, grab that mug,’ and it just does it, no fancy code needed.
Safety’s on their mind too. They’re pairing these models with classic robot safety measures, like collision avoidance, and adding Gemini’s reasoning to judge what’s risky. They’ve even got a new ASIMOV dataset to test how safely these bots behave in real-world situations. Google’s working with their Responsible Development team and outside experts to keep things in check as they team up with companies like Boston Dynamics and Agility Robotics.
Sundar Pichai set the tone, saying, ‘We’ve always thought of robotics as a helpful testing ground for translating AI advances into the physical world.’
Now they’re proving it. The Gemini Robotics models are in the hands of trusted testers, and Google’s partnering with Apptronik to build humanoid robots that might one day roam our homes or workplaces.