Shortly after releasing the experimental version of Gemini 2.5 Pro, a model with strong capabilities in programming, mathematical reasoning, and scientific analysis, Google announced at its Google Cloud Next '25 event that it is launching Gemini 2.5 Flash, which offers lower latency and better cost-efficiency. The new model will also be available through Google Cloud's Vertex AI platform and Google AI Studio.
Gemini 2.5 Pro can take in up to 1 million tokens of content at once, which lets it perform in-depth data analysis, surface key insights in specialized professional fields, or carry out complex coding work after reading an entire codebase. Google positions it as its most capable artificial intelligence model. Gemini 2.5 Flash, by comparison, offers lower latency and lower usage costs while still maintaining a reasonable level of accuracy. Google expects it to become the main model for most application services, and it is well suited to building interactive virtual assistants or real-time content summarization tools.
Gemini 2.5 Flash also features dynamic, controllable reasoning: it automatically adjusts its processing time to the complexity of the question (which can be thought of as a "thinking budget"), so questions with simple answers get faster responses. Developers and businesses can also set this budget themselves, trading off cost, response speed, and accuracy according to actual needs, so that operating budgets are used more efficiently.
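The idea of a thinking budget can be illustrated with a small sketch. It is not Google's implementation: the function name `pick_thinking_budget`, the word-count heuristic, and every threshold below are assumptions chosen only to show how a budget might scale with question complexity.

```python
def pick_thinking_budget(prompt: str, max_budget: int = 1024) -> int:
    """Return a hypothetical reasoning-token budget for a prompt.

    Assumption: word count is used as a crude stand-in for complexity.
    A real system would use a far richer signal.
    """
    words = len(prompt.split())
    if words <= 8:
        # Short, simple question: skip extended reasoning entirely.
        return 0
    # Scale linearly with length, capped at the caller's ceiling.
    return min(max_budget, words * 16)


if __name__ == "__main__":
    print(pick_thinking_budget("What time is it?"))
    print(pick_thinking_budget(" ".join(["word"] * 20)))
```

Lowering `max_budget` corresponds to the cost control described above: a smaller ceiling favors speed and cost over depth of reasoning.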
At the same time, to make it easier for users to choose between Gemini 2.5 Pro and Gemini 2.5 Flash, Google launched the experimental Vertex AI Model Optimizer, which automatically produces the highest-quality response for each prompt based on the user's desired balance of quality and cost.
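A rough sketch of this kind of quality-versus-cost selection follows. It is a toy under stated assumptions, not how the Google tool works: the `ModelOption` class, the `choose_model` function, and the quality and cost figures are all illustrative placeholders rather than published numbers.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelOption:
    name: str
    quality: float  # hypothetical relative quality score
    cost: float     # hypothetical relative cost per request


# Placeholder values for illustration only.
OPTIONS = [
    ModelOption("gemini-2.5-flash", quality=0.8, cost=1.0),
    ModelOption("gemini-2.5-pro", quality=1.0, cost=5.0),
]


def choose_model(min_quality: float, max_cost: float) -> Optional[str]:
    """Pick the cheapest option that meets the quality floor and cost cap."""
    candidates = [
        m for m in OPTIONS
        if m.quality >= min_quality and m.cost <= max_cost
    ]
    if not candidates:
        return None  # no option satisfies both constraints
    return min(candidates, key=lambda m: m.cost).name


if __name__ == "__main__":
    print(choose_model(min_quality=0.5, max_cost=10.0))
    print(choose_model(min_quality=0.9, max_cost=10.0))
```

Picking the cheapest model that clears the quality floor is one simple policy; a per-prompt optimizer would presumably also weigh the prompt itself.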
For workloads that do not need to be processed in a fixed region, Google has also launched Vertex AI Global Endpoint, which provides traffic-aware routing across regions so that Gemini models stay responsive even during traffic peaks or when a regional service is unstable.
In addition, Google announced the Live API for Gemini models on the Vertex AI platform, which lets agents built on Gemini process voice, video, and text with lower latency and so deliver more human-like interactions such as real-time conversation and real-time monitoring. It also supports conversations longer than 30 minutes, multilingual audio analysis, and integration with more functions to handle more complex tasks.



