Google's Gemini 2.5 Deep Think Launches with Impressive Performance Metrics

Google's Gemini 2.5 Deep Think model launches, outperforming rival models on several benchmarks.

Key Points

  • Gemini 2.5 Deep Think launches for $250/month with a multi-agent architecture.
  • Outperforms Grok 4 and OpenAI o3 in key benchmarks, achieving 34.8% on Humanity's Last Exam.
  • Secured a gold medal at the International Math Olympiad, demonstrating advanced problem-solving.
  • Plans to broaden access via the Gemini API for specialized applications across industries.

Google DeepMind has officially launched its advanced AI model, Gemini 2.5 Deep Think, which is available to subscribers of the $250-per-month Ultra plan. The model uses a multi-agent architecture in which multiple AI agents work on a complex problem in parallel, an approach intended to strengthen its reasoning abilities. Gemini 2.5 Deep Think has posted strong results on several benchmarks, putting it in direct competition with models such as xAI's Grok 4 and OpenAI's o3.
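The idea of several agents attacking one problem in parallel can be illustrated with a toy sketch. Google has not published how Deep Think coordinates its agents, so everything here is an assumption made for illustration: the names make_agent and solve_in_parallel are hypothetical, the "agents" are plain Python functions rather than AI models, and majority voting is just one simple way to combine candidate answers.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def make_agent(bias):
    """Build a hypothetical agent: a function that maps a problem to an answer.

    The bias term stands in for an agent that reasons its way to a wrong result.
    """
    def agent(problem):
        a, b = problem
        return a + b + bias
    return agent


def solve_in_parallel(problem, agents):
    """Run every agent on the same problem concurrently, then take a majority vote."""
    with ThreadPoolExecutor(max_workers=len(agents)) as pool:
        # pool.map returns results in the same order as the agents list.
        candidates = list(pool.map(lambda agent: agent(problem), agents))
    answer, votes = Counter(candidates).most_common(1)[0]
    return answer, votes, candidates


if __name__ == "__main__":
    # Two agents agree; one is deliberately off by one.
    agents = [make_agent(0), make_agent(0), make_agent(1)]
    answer, votes, candidates = solve_in_parallel((2, 3), agents)
    print(f"candidates={candidates} chosen={answer} votes={votes}")
```

The sketch also shows why such a design is expensive: every extra agent is another full attempt at the problem, so compute cost grows with the number of agents even though only one answer is returned.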

In recent performance evaluations, Gemini 2.5 Deep Think scored 34.8% on Humanity's Last Exam, a challenging benchmark that measures an AI model's reasoning capabilities. That puts it ahead of Grok 4 at 25.4% and OpenAI's o3 at 20.3%. In programming assessments, Gemini 2.5 Deep Think scored 87.6%, compared with 79% for Grok 4 and 72% for o3, reinforcing its lead in technical domains.
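To put the reported gaps in concrete terms, the short sketch below computes Gemini 2.5 Deep Think's lead over each rival in percentage points, using only the scores quoted above. The function name margins and the layout of the scores dictionary are illustrative choices, not part of any benchmark tooling.

```python
LEADER = "Gemini 2.5 Deep Think"

# Scores (in percent) as quoted in this article.
scores = {
    "Humanity's Last Exam": {LEADER: 34.8, "Grok 4": 25.4, "o3": 20.3},
    "Programming": {LEADER: 87.6, "Grok 4": 79.0, "o3": 72.0},
}


def margins(results, leader=LEADER):
    """Return the leader's lead over each rival, in percentage points."""
    return {
        rival: round(results[leader] - score, 1)
        for rival, score in results.items()
        if rival != leader
    }


if __name__ == "__main__":
    for benchmark, results in scores.items():
        print(benchmark, margins(results))
```

These are differences in percentage points, not relative improvements: a 9.4-point lead on Humanity's Last Exam is proportionally much larger than an 8.6-point lead on the programming assessments, because the baseline scores differ so much.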

The model's design has also drawn attention for its results outside standard benchmarks. For instance, a version of Gemini 2.5 Deep Think recently earned a gold medal at the International Math Olympiad, showcasing its proficiency in solving intricate mathematical problems. This success highlights the model's potential not just in straightforward reasoning tasks but in harder problems that demand extended computation.

Despite its promising capabilities, the multi-agent architecture of Gemini 2.5 Deep Think presents operational challenges because of its high computational demands. As a result, such resource-intensive systems tend to be offered only through premium subscription plans. The same pattern appears elsewhere in the AI industry: xAI, for example, restricts Grok 4 Heavy to a similar high-tier subscription.

Looking ahead, Google plans to broaden access to Gemini 2.5 Deep Think by testing it through the Gemini API, targeting developers and businesses that could benefit from its advanced reasoning capabilities. The move is aimed at making multi-agent models accessible across more fields, potentially driving further innovation in artificial intelligence.