Boston Dynamics’ robot dog now reads gauges and thermometers with Google’s AI


Robots such as Boston Dynamics’ four-legged Spot can now accurately read analog thermometers and pressure gauges while roaming around factories and warehouses. The improvement comes courtesy of Google DeepMind’s newest robotics AI model, which aims to enhance robots’ ‘embodied reasoning’ as they interact with physical environments.

The new Gemini Robotics-ER 1.6 model, announced on April 14, serves as a “high-level reasoning model for a robot” that can plan and execute tasks, according to Google DeepMind. The model also enables robots to accurately read instruments such as complex gauges and to perform visual inspections through sight glasses, the transparent windows that allow a peek inside tanks and pipes. That performance upgrade came about through Google DeepMind’s ongoing collaboration with robotics company Boston Dynamics.

Boston Dynamics has a keen interest in testing both quadruped and humanoid robotic workers across a wide range of industrial facilities, including the automotive factories of its corporate owner, Hyundai Motor Group. The company’s robot “dog,” Spot, is being trialled as a robotic inspector that roams industrial facilities to monitor equipment. Such inspection duties require “complex visual reasoning” to interpret the needles, liquid levels, container boundaries, tick marks and text found on various instruments.

The model driving it

To handle such tasks, the Gemini Robotics-ER 1.6 model provides robots with “agentic vision,” which combines visual reasoning with the ability to execute code, creating a “visual scratchpad” for inspecting and manipulating images. Agentic vision was first introduced in Google’s Gemini 3.0 Flash model back in January 2026.
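Google has not published the internals of the scratchpad, but the core idea is that the model writes and runs small image-manipulation programs, for example cropping into a region of interest before re-reading it. A minimal illustrative sketch, with a tiny 2D grid standing in for a camera image and all names hypothetical (not any Google API):

```python
# Illustrative sketch only: the real "visual scratchpad" operates on camera
# images inside the model's tool loop. A small 2D grid stands in for an image,
# and crop_region() stands in for one zoom-and-reinspect step.

def crop_region(image, top, left, height, width):
    """Return the sub-grid covering a region of interest."""
    return [row[left:left + width] for row in image[top:top + height]]

# A 4x6 "image" where nonzero pixels mark a gauge needle.
image = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 9, 9, 0, 0],
    [0, 0, 0, 9, 0, 0],
    [0, 0, 0, 0, 0, 0],
]

# Zoom into the 2x2 patch containing the needle for a closer look.
patch = crop_region(image, top=1, left=2, height=2, width=2)
print(patch)  # [[9, 9], [0, 9]]
```

The point of such a loop is that the model can verify its own reading against a magnified view rather than answering from a single low-resolution glance.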

The agentic vision capability reportedly boosts performance on instrument-reading tasks from 23 percent with the older Gemini Robotics-ER 1.5 model to 98 percent with the new Gemini Robotics-ER 1.6 model. For comparison, Gemini 3.0 Flash delivered just 67 percent accuracy.

Even without agentic vision, the baseline Gemini Robotics-ER 1.6 model achieves 86 percent accuracy in reading instruments. That is because the model handles complex tasks, such as counting items or identifying the most salient features, by pointing to individual elements within an image. It also reportedly delivers improved “multi-view reasoning,” allowing a robotic system to draw on multiple camera streams to better understand its environment.
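Reading an analog gauge ultimately reduces to simple geometry: once the needle’s angle has been estimated from the image, the indicated value is a linear interpolation across the dial’s sweep. A sketch of that final step, with gauge parameters that are purely hypothetical (not from Google or Boston Dynamics):

```python
# Illustrative sketch: map an estimated needle angle to the value it
# indicates on an analog dial via linear interpolation. The dial sweep
# and value range below are made-up example parameters.

def gauge_reading(needle_deg, min_deg, max_deg, min_val, max_val):
    """Convert a needle angle (degrees) to the reading on the dial."""
    fraction = (needle_deg - min_deg) / (max_deg - min_deg)
    return min_val + fraction * (max_val - min_val)

# Example: a pressure gauge whose dial sweeps 270 degrees, from 0 to 10 bar.
reading = gauge_reading(needle_deg=135, min_deg=0, max_deg=270,
                        min_val=0.0, max_val=10.0)
print(reading)  # 5.0
```

The hard part for a robot is not this arithmetic but the perception feeding it: locating the dial, the needle and the tick marks under varying lighting and viewing angles, which is where the model’s pointing and multi-view reasoning come in.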
