NIX Solutions: Google Expands AI Mode with Image Recognition

Last month, Google introduced AI Mode, an AI-powered search chatbot integrated into its proprietary app. Now, AI Mode has gained the ability to “see” images and answer questions about them. This new capability is already available to “millions of new users.”

The update enhances the search chatbot by combining a custom version of the Gemini large language model with Lens image recognition technology. With this integration, users can take a screenshot or upload an image and receive a “rich, comprehensive answer with links” related to the content of the original file. Starting today, this innovation is available in the Google app for both Android and iOS devices.

A Google representative emphasized that AI Mode builds on the company’s long-standing work on visual search, which helped make this advancement possible. The representative added that Gemini’s multimodal capabilities enable the chatbot to interpret the entire scene in an image, recognizing not just individual elements but also the context in which they appear. This includes understanding how objects relate to one another, as well as their shapes, colors, positions, and other contextual details.

Enhanced Contextual Understanding Through AI

According to Google, the updated feature relies on what it calls a “query fan-out” technique. In this approach, the system issues multiple queries about the image as a whole and about the objects within it. This process allows the system to generate “incredibly subtle and contextually relevant” answers that reflect a deeper understanding of visual input.
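
To make the idea of issuing several queries per image more concrete, here is a minimal Python sketch. It is an illustration only, not Google's implementation: the Scene and DetectedObject types, the query format, and the search callback are all hypothetical names invented for this example. It builds one query for the whole scene and one per detected object, then merges the results.

```python
from dataclasses import dataclass, field


@dataclass
class DetectedObject:
    # Hypothetical record for one object found in the image.
    label: str
    color: str = ""


@dataclass
class Scene:
    # Hypothetical summary of the whole image plus its objects.
    description: str
    objects: list = field(default_factory=list)


def fan_out_queries(scene, question):
    """Build one query about the whole scene plus one per detected object."""
    queries = [f"{question} | scene: {scene.description}"]
    for obj in scene.objects:
        detail = f"{obj.color} {obj.label}".strip()
        queries.append(f"{question} | object: {detail}")
    return queries


def answer(scene, question, search):
    """Run every query through `search` and merge results, dropping duplicates.

    `search` is an assumed callback that takes a query string and
    returns a list of result strings.
    """
    merged = []
    seen = set()
    for query in fan_out_queries(scene, question):
        for result in search(query):
            if result not in seen:
                seen.add(result)
                merged.append(result)
    return merged
```

In this sketch, an image of a bookshelf with two recognized objects would produce three queries, and any result returned by more than one of them appears only once in the merged answer.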

NIX Solutions notes that AI Mode initially launched exclusively for Google One AI Premium subscribers. With the recent update, however, the feature is now available to a wider user base in the US. The rollout signals Google’s broader intention to make advanced AI tools more accessible.

We’ll keep you updated as this feature evolves and more integrations become available. The expansion of AI Mode demonstrates Google’s ongoing focus on enhancing search through artificial intelligence and multimodal capabilities.