Mar 9, 2026
You can now select Google's latest small model to run your agents at a fraction of the cost of larger models. It's noticeably faster than 2.5 Flash and priced at $0.25/1M input tokens and $1.50/1M output tokens.
It also comes with built-in thinking levels, letting you dial up reasoning depth when a task requires it, without paying for it when it doesn't.
Best for:
High-volume document processing and classification
Real-time content moderation and extraction
Workflows where speed and cost matter at scale
Gemini 3.1 Flash-Lite scores 86.9% on GPQA Diamond, a reasoning benchmark, which puts it ahead of some of Google's larger models from the previous generation.

Andrea Azzini
Head of Product


