Written By Deepti Ratnam
Published By: Deepti Ratnam | Published: Mar 04, 2026, 10:15 AM (IST)
Google has launched a new AI model called Gemini 3.1 Flash Lite. According to the company, it is the fastest and most affordable model in the Gemini 3 series. It is built for developers who run high-volume workloads and need faster responses at a lower price. The model is available in preview through the Gemini API in AI Studio and, for enterprises, through Vertex AI.
Low pricing is one of the biggest attractions of Gemini 3.1 Flash Lite. The model is priced at $0.25 per one million input tokens and $1.50 per one million output tokens. This makes it a cost-effective choice for developers working on large-scale applications.
Google says the model delivers good performance without high costs. Its quality is close to that of larger AI models, while its price is much lower. For startups and companies that run heavy AI workloads, this pricing can lower total operating costs.
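To put the quoted rates in perspective, here is a minimal sketch of a cost estimate. It assumes the rates are in US dollars per one million tokens, and the workload figures (10,000 requests of 2,000 input and 500 output tokens each) are hypothetical numbers chosen only for illustration.

```python
# Rates quoted in the article, assumed to be US dollars per one million tokens.
INPUT_RATE_PER_MILLION = 0.25
OUTPUT_RATE_PER_MILLION = 1.50


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost for the given token volumes."""
    input_cost = input_tokens / 1_000_000 * INPUT_RATE_PER_MILLION
    output_cost = output_tokens / 1_000_000 * OUTPUT_RATE_PER_MILLION
    return input_cost + output_cost


# Hypothetical workload: 10,000 requests, each with 2,000 input tokens
# and 500 output tokens.
requests = 10_000
total = estimate_cost(requests * 2_000, requests * 500)
print(f"Estimated cost: {total:.2f}")
```

Because output tokens cost six times as much as input tokens at these rates, the length of the generated responses tends to dominate the bill in a workload like this one.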
Speed is another key strength of Gemini 3.1 Flash Lite. Google says the model is 2.5 times faster than Gemini 2.5 Flash at generating the first answer token. It also delivers a 45 percent increase in output speed.
These improvements mean users will see faster responses in real-time interactions. Output speed matters for chatbots, automation tools and customer support systems, where response time is critical.
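As a rough illustration of what the two speed figures could mean for a single response, the sketch below models total response time as time to first token plus generation time. The baseline values are hypothetical and not measurements of any model, and reading "2.5 times faster" as dividing the time to first token by 2.5 is an interpretation rather than something Google has spelled out here.

```python
def response_time(ttft_s: float, tokens_per_s: float, output_tokens: int) -> float:
    """Total time = time to first token + time to generate the output."""
    return ttft_s + output_tokens / tokens_per_s


# Hypothetical baseline figures, chosen only to make the arithmetic easy.
BASELINE_TTFT_S = 1.0
BASELINE_TOKENS_PER_S = 100.0

# Apply the improvements quoted in the article.
new_ttft_s = BASELINE_TTFT_S / 2.5                 # first token 2.5x faster
new_tokens_per_s = BASELINE_TOKENS_PER_S * 1.45    # 45 percent higher output speed

output_tokens = 290
baseline_time = response_time(BASELINE_TTFT_S, BASELINE_TOKENS_PER_S, output_tokens)
new_time = response_time(new_ttft_s, new_tokens_per_s, output_tokens)

print(f"Baseline: {baseline_time:.2f} s, improved: {new_time:.2f} s")
```

For short replies the faster first token accounts for most of the saving, while for long replies the higher output speed matters more.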
The model also achieved an Elo score of 1432 on the Arena.ai Leaderboard. It outperformed other models in its category, particularly on reasoning and multimodal understanding tasks. This indicates that the model can handle both text and more complex inputs.
Gemini 3.1 Flash Lite also offers a useful feature called thinking levels, available in AI Studio and Vertex AI. It lets developers control how much reasoning the model applies to a given task.
This flexibility helps developers balance speed and accuracy according to their needs. Companies such as Latitude, Cartwheel and Whering are already testing the model. Early feedback suggests that it can handle complex instructions without losing accuracy or consistency.
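One way a developer might use the thinking levels described above is to pick a level per task type, spending less reasoning on simple jobs and more on complex ones. The sketch below is illustrative only: the level names ("low", "medium", "high"), the task categories and the request dictionary are assumptions made for this example, not the actual Gemini API schema.

```python
# Hypothetical mapping from task type to thinking level. The level names are
# assumptions; check the API documentation for the values actually accepted.
THINKING_LEVEL_BY_TASK = {
    "classification": "low",
    "translation": "low",
    "summarization": "medium",
    "multi_step_reasoning": "high",
}


def pick_thinking_level(task_type: str, default: str = "medium") -> str:
    """Return the thinking level for a task, falling back to a default."""
    return THINKING_LEVEL_BY_TASK.get(task_type, default)


def build_request(task_type: str, prompt: str) -> dict:
    """Build a plain dictionary describing the request (not a real SDK payload)."""
    return {"prompt": prompt, "thinking_level": pick_thinking_level(task_type)}


print(build_request("classification", "Is this review positive or negative?"))
```

Keeping the mapping in one place makes it easy to adjust the speed and accuracy trade-off per task without touching the calling code.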
With Gemini 3.1 Flash Lite, Google is prioritising speed, low cost and scalable performance. The model is aimed at developers who want quality AI performance without spending a lot. As businesses adopt AI more widely, fast and affordable models such as this one could play a significant role in large-scale AI deployment.