Google DeepMind Unveils Gemini Robotics Models for Enhanced Robotic Capabilities

April 02, 2025
Google DeepMind has developed the Gemini Robotics models, a new family of AI-driven systems designed to improve robots' adaptability, dexterity, and ability to perform complex physical tasks.

Google Gemini Robotics Models Development

Google DeepMind has developed a new family of Gemini Robotics models intended to help robots perform complex physical tasks with greater adaptability and dexterity. The models build on Gemini 2.0 and are fine-tuned with robot-specific data, which adds physical action as an output alongside Gemini's existing multimodal outputs such as text, video, and audio.

Key Features

  • Multimodal Capabilities: Gemini Robotics models combine vision, language, and action to improve how robots interact with the world.
  • Adaptability: Robots can adapt to new environments and objects without specific training, thanks to the generalization capabilities of the models.
  • Dexterity: The models enable robots to perform delicate tasks with precision, such as folding origami, packing lunch items, and playing games like Tic-Tac-Toe.
  • Embodied Reasoning: Gemini Robotics-ER focuses on enhanced spatial understanding and embodied reasoning, allowing robots to detect objects, predict trajectories, and generate code to execute actions.
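
To make the "detect objects, predict trajectories, generate actions" idea concrete, here is a minimal Python sketch of a pick-and-place plan. It is illustrative only and is not the Gemini Robotics API: the Detection type, the function names, and the straight-line 2D trajectory are all assumptions chosen for simplicity, and a real system would work in 3D with a perception model supplying the detections.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]


@dataclass
class Detection:
    """Hypothetical detector output: a label plus a pixel bounding box."""
    label: str
    box: Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def box_center(box: Tuple[float, float, float, float]) -> Point:
    """Return the center point of a bounding box."""
    x_min, y_min, x_max, y_max = box
    return ((x_min + x_max) / 2, (y_min + y_max) / 2)


def linear_trajectory(start: Point, goal: Point, steps: int) -> List[Point]:
    """Return `steps` evenly spaced waypoints from start to goal, inclusive."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    return [
        (
            start[0] + (goal[0] - start[0]) * i / (steps - 1),
            start[1] + (goal[1] - start[1]) * i / (steps - 1),
        )
        for i in range(steps)
    ]


def plan_pick_and_place(
    detections: List[Detection], item: str, container: str, steps: int = 5
) -> List[Point]:
    """Plan a straight-line path from the item's center to the container's center."""
    by_label = {d.label: d for d in detections}
    if item not in by_label or container not in by_label:
        raise KeyError("item or container was not detected")
    start = box_center(by_label[item].box)
    goal = box_center(by_label[container].box)
    return linear_trajectory(start, goal, steps)


# Example scene with made-up coordinates.
scene = [
    Detection("banana", (0, 0, 10, 10)),
    Detection("lunchbox", (100, 100, 120, 140)),
]
plan = plan_pick_and_place(scene, "banana", "lunchbox", steps=3)
```

The sketch separates perception (the detections) from planning (the waypoints), which mirrors the division described in the Embodied Reasoning item above: a reasoning model interprets the scene, and the resulting plan is handed to whatever controller drives the robot.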

Development Process

The Gemini Robotics models were trained on a broad range of tasks rather than on one task at a time. This approach, known as broad task learning, allowed the models to generalize across tasks and environments. To validate the models' capabilities, the team ran extensive tests, including putting pens into a shoe and slam-dunking a toy basketball through a miniature hoop.

Collaborations and Applications

Google DeepMind is collaborating with partners such as Apptronik to integrate these AI models into humanoid robots. The models are designed to adapt to multiple embodiments, including academic-focused robots and humanoid robots, so that the same task, such as packing a lunchbox or wiping a whiteboard, can be carried out on different robot bodies. Potential applications span both consumer and industrial settings, though specific commercial products and timelines for wider availability have not yet been announced.

Safety Measures

Safety is a key consideration in the development of Gemini Robotics models. Google DeepMind has incorporated traditional robotics safeguards and leveraged Gemini's core safety features. The company has also introduced a new dataset called ASIMOV to help researchers measure the safety implications of robotic actions in real-world scenarios.

Future Prospects

The introduction of Gemini Robotics models represents a significant step towards bringing AI into the physical world. While challenges remain in refining robotic dexterity, real-time decision-making, and broader generalization, these models lay the groundwork for AI-driven robots that can assist in homes, workplaces, and beyond.

For more detail, see the official Google blog post and the Maginative article listed under Sources.

Sources

"How we built the new family of Gemini Robotics models". The Gemini Robotics models are highly dextrous, interactive and general, meaning they can drive robots to react to new objects, environments ...
"Inside Google's Gemini: Unveiling the Development of Advanced ...". Explore the groundbreaking advancements in robotics with Google's Gemini. This article delves into the innovative development processes, ...
"Google DeepMind Unveils Gemini Robotics Models to Bridge AI and ...". Google DeepMind has launched Gemini Robotics, a suite of AI models that enables robots to perform complex physical tasks with unprecedented ...