In its latest quest to redefine the artificial intelligence landscape, Google announced Gemini 2.0 Flash Thinking, a multimodal reasoning model that solves complex problems both quickly and transparently.
In a post on the social network X, Google CEO Sundar Pichai called it: “Our most thoughtful model ever :)”
And in its developer documentation, Google explains: “Thinking Mode provides greater inference capabilities in its responses than the base Gemini 2.0 Flash model,” which was itself Google’s newest and most capable model when it launched just eight days earlier.
The new model supports only 32,000 input tokens (roughly 50-60 pages of text) and can generate up to 8,000 tokens per response. In the Google AI Studio sidebar, the company describes it as the best solution for “multimodal understanding, inference” and “coding.”
Full details about the model’s training process, architecture, licensing, and costs have not yet been revealed. For now, Google AI Studio shows a token cost of zero for the model.
Accessible and clearer reasoning
Unlike OpenAI’s competing o1 and o1-mini reasoning models, Gemini 2.0 Flash Thinking lets users view its detailed reasoning via a drop-down menu, offering clearer and more transparent insight into how the model reaches its conclusions.
By enabling users to see how decisions are made, Gemini 2.0 Flash Thinking addresses long-standing concerns about AI functioning as a “black box” and puts this model (licensing terms still unclear) on par with open-source models offered by competitors.
My early, simple tests of the model showed it accurately and quickly (within one to three seconds) answered several questions that are notoriously difficult for other AI models, such as counting the number of R’s in the word “Strawberry.” (See screenshot above.)
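For reference, the “Strawberry” question has a trivially verifiable ground truth. A minimal Python check (an illustration of what the model is asked to compute, not anything from Google’s stack) looks like this:

```python
def count_letter(word: str, letter: str) -> int:
    """Count case-insensitive occurrences of a letter in a word."""
    return word.lower().count(letter.lower())

# The famous test that trips up many LLMs:
print(count_letter("Strawberry", "r"))  # → 3
```

Models fail this not because the arithmetic is hard, but because tokenization hides individual letters from them, which is why it became a standard probe for reasoning models.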
In another test, comparing two decimal numbers (9.9 and 9.11), the model systematically broke the problem down into smaller steps, from comparing the integer parts to comparing the decimal places.
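The decomposition described above can be sketched in plain Python. This is a hypothetical illustration of the model’s reasoning steps (integer parts first, then padded decimal places), not the model’s actual procedure:

```python
def compare_decimals(a: str, b: str) -> str:
    """Return the larger of two decimal number strings, mimicking the
    step-by-step breakdown: integer parts first, then decimal places."""
    a_int, _, a_frac = a.partition(".")
    b_int, _, b_frac = b.partition(".")
    # Step 1: compare the integer parts numerically.
    if int(a_int) != int(b_int):
        return a if int(a_int) > int(b_int) else b
    # Step 2: pad the fractional parts to equal length, so that
    # "9" vs "11" becomes "90" vs "11", then compare digit by digit.
    width = max(len(a_frac), len(b_frac))
    a_frac, b_frac = a_frac.ljust(width, "0"), b_frac.ljust(width, "0")
    if a_frac == b_frac:
        return "equal"
    return a if a_frac > b_frac else b

print(compare_decimals("9.9", "9.11"))  # → 9.9
```

The padding step is exactly where naive string comparison goes wrong (“11” looks longer than “9”), which is why 9.11 vs 9.9 remains a common LLM stumbling block.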
These results are supported by the independent third-party evaluation platform LM Arena, which ranked Gemini 2.0 Flash Thinking as the number one model across all LLM categories.
Native support for image uploading and evaluation
As a further improvement over the competing OpenAI o1 family, Gemini 2.0 Flash Thinking is designed to handle image processing from the outset.
o1 began as a text-only model but has since been extended to accept image and file uploads for analysis. Both models can currently only return text.
According to the company’s developer documentation, Gemini 2.0 Flash Thinking does not currently support grounding with Google Search or integration with other Google applications and external third-party tools.
The multimodal capabilities of Gemini 2.0 Flash Thinking expand its potential applications by enabling it to handle scenarios that combine several types of data. In one test, for example, the model solved a puzzle that required analyzing both text and visual elements, demonstrating its versatility in integrating and reasoning across multiple formats.
Developers can use these features through Google AI Studio and Vertex AI, where the model is available for experimentation.
As the artificial intelligence landscape becomes increasingly competitive, Gemini 2.0 Flash Thinking could mark the start of a new era of problem-solving models. Its ability to handle diverse data types, expose its reasoning, and perform at scale positions it as a serious contender in the reasoning AI market, rivaling OpenAI’s o1 family and beyond.