Google Unveils Gemini 2.5 Deep Think AI: A New Challenger in Complex Reasoning
Google launches Gemini 2.5 Deep Think AI model, emphasizing advanced reasoning capabilities and safety evaluations.
Key Points
- Gemini 2.5 Deep Think requires extended processing time for complex reasoning tasks.
- Achieved bronze-level performance on the International Mathematical Olympiad; a research version reached gold-level performance.
- Available to subscribers of the $250/month AI Ultra plan, with broader API access planned for the future.
- Google says it is conducting additional safety evaluations before a wider release.
Google has officially launched its Gemini 2.5 Deep Think AI model, specifically designed to handle complex reasoning and problem-solving tasks. This model, which debuted at the recent Google I/O conference, differentiates itself from traditional AI by taking significantly longer to process information, often requiring several minutes to formulate answers after considering multiple approaches.
Notably, Gemini 2.5 Deep Think has achieved bronze-level performance on challenges from the International Mathematical Olympiad, while a specialized research variant of the model attained gold-level performance. On the LiveCodeBench benchmark, which measures competitive coding ability, a score of 87.6% was reported, up from the 80.4% reported for an earlier version in May. Additionally, on Humanity's Last Exam, which tests advanced questions across many subjects, Deep Think achieved 34.8% accuracy, ahead of competing AI models that scored between 20% and 25%.
This model is currently available exclusively to subscribers of Google's AI Ultra plan, which costs $250 per month, with daily query limits not yet specified by the company. Google has indicated future plans to provide access to the Deep Think model through its developer API aimed at enterprises and research contexts. This strategic move reflects a competitive landscape in AI development where firms are battling to create systems adept at complex calculations, science-related inquiries, and sophisticated coding challenges.
In a blog announcement, Google emphasized its commitment to safety and thorough evaluation for Deep Think, stating, "Because we're defining the frontier with 2.5 Pro DeepThink, we're taking extra time to conduct more frontier safety evaluations and get further input from safety experts." The plan includes gathering feedback from trusted testers before any broader rollout. The release marks a significant step in Google’s AI development strategy and in the ongoing race among tech companies to improve AI capability while addressing safety considerations.