Google Gemini API Pricing June 2026 — Official Per-Million-Token Rates, Free Tier & Calculator

Google Gemini API Pricing June 2026 — Official Per-Million-Token Rates, Free Tier & Calculator

(Updated: ) 📖 5 min read

Google’s Gemini family has expanded significantly in 2026 with the launch of the Gemini 3.5 Flash (May 19). Whether you’re building a chatbot, processing millions of documents, or creating the next AI-powered app, understanding the pricing is critical to keeping your costs under control.

💡 Calculate your exact costs instantly: Use our AI API Pricing Calculator to compare Gemini against OpenAI, Grok, and Claude for your specific workload — including images, audio, and video.


Quick Decision Guide: Which Gemini Model to Pick

If you need… Use this model Cost (Input/Output per 1M)
Maximum intelligence Gemini 3.1 Pro $2.00 / $12.00
New flagship balanced Gemini 3.5 Flash $1.50 / $9.00
Best all-around value Gemini 3 Flash $0.50 / $3.00
Ultra-cheap processing Gemini 3.1 Flash-Lite $0.25 / $1.50
Absolute minimum cost Gemini 2.5 Flash-Lite $0.10 / $0.40

The Gemini Model Lineup at a Glance

Think of the Gemini family as a car dealership — each tier serves a different driver:

Tier Analogy Best For
Gemini 3.5 Flash New sports sedan Latest capabilities, balanced performance
Gemini 3.1 Pro Luxury sports car Complex reasoning, advanced coding, research
Gemini 3 Flash Reliable daily driver General-purpose apps, chatbots, summarization
Gemini 3.1 Flash-Lite Ultra-efficient compact High-volume batch processing, classification
Gemini 2.5 Pro Previous-gen flagship Legacy workloads, proven reliability
Gemini 2.5 Flash Budget all-rounder Cost-conscious production apps
Gemini 2.5 Flash-Lite Micro car Maximum scale at minimum cost

Complete Pricing Breakdown (Per 1 Million Tokens)

🆕 Gemini 3.5 Flash — The New Contender (May 19, 2026)

Cost Type Price per 1M Tokens
Input $1.50
Output $9.00
Cached Input ~$0.375

The latest addition to the Gemini lineup, offering enhanced capabilities over standard Flash at a mid-range price point.

Gemini 3.1 Pro — The Flagship Powerhouse

Best for: Complex coding tasks, multi-step reasoning, advanced research, agentic workflows with 1M token context.

Cost Type Standard (≤200K context) Long Context (>200K)
Input $2.00 $4.00
Output $12.00 $24.00

Pro tip: Gemini 3.1 Pro doubles in cost when your prompt exceeds 200,000 tokens. Keep prompts concise or use context caching to avoid the premium.

Gemini 3 Flash — The Smart All-Rounder

Cost Type Price per 1M Tokens
Input (text/image/video) $0.50
Output $3.00

Flat pricing — no long-context surcharge. This makes Flash ideal for applications with variable prompt lengths.

Gemini 3.1 Flash-Lite — The Budget Champion

Cost Type Price per 1M Tokens
Input (text/image/video) $0.25
Output $1.50

At just $0.25 per million input tokens, Flash-Lite is one of the cheapest production-grade AI models available anywhere.

Legacy Models (Still Available)

Model Input (per 1M) Output (per 1M) Notes
Gemini 2.5 Pro $1.25 $10.00 2x cost for >200K context
Gemini 2.5 Flash $0.30 $2.50 Flat pricing
Gemini 2.5 Flash-Lite $0.10 $0.40 Cheapest option available

Gemini 2.5 Flash-Lite at $0.10/M input remains the absolute cheapest model in Google’s lineup — perfect for ultra-high-volume workloads.


Multimodal Token Consumption

Gemini natively supports images, audio, and video. Here’s how tokens are calculated:

Modality Token Rate Example
Images 258 tokens per tile (768×768px) 5 standard photos = ~1,290 tokens
Audio 32 tokens per second 1 minute = ~1,920 tokens
Video 263 tokens per second 1 minute = ~15,780 tokens

🧮 Calculate multimodal costs: Our pricing calculator handles images, audio, and video token estimation automatically.


Cost Optimization Strategies

1. Context Caching — Save Up to 90%

Cache frequently used system prompts, large documents, or reference materials. Cached tokens cost as little as 10% of the standard input price.

2. Batch API — Save 50%

For non-urgent workloads (data processing, nightly reports), the Batch API cuts costs by 50% with 24-hour turnaround.

3. Free Tier in Google AI Studio

Flash and Flash-Lite models offer a generous free tier for prototyping — perfect for testing before committing to paid usage.


Real-World Cost Comparison: 10 Use Cases

Use Case Gemini 3.1 Pro Gemini 3 Flash Gemini Flash-Lite Gemini 2.5 Flash-Lite
Summarize 100K-word doc ~$0.28 ~$0.07 ~$0.04 ~$0.01
10K chatbot msgs/day (monthly) $960 $240 $120 $48
1K image analyses/day (monthly) $18 $4.50 $2.25 $0.90
100 audio transcripts/day 5min each (monthly) $58 $14 $7 N/A
Code review 500 PRs/day (monthly) $1,440 $360 $180 $72

How Gemini Compares to Other Providers

Model Input/1M Output/1M Context Provider
Gemini 2.5 Flash-Lite $0.10 $0.40 1M Google
GPT-4.1 Nano $0.10 $0.40 1M OpenAI
Grok 4.1 Fast $0.20 $0.50 2M xAI
Gemini 3 Flash $0.50 $3.00 1M Google
Grok 4.3 $1.25 $2.50 1M xAI
GPT-4.1 $2.00 $8.00 1M OpenAI
Claude Sonnet 4.6 $3.00 $15.00 1M Anthropic

Key Takeaways

  1. Gemini 3.5 Flash (NEW) adds a mid-tier option between Flash and Pro
  2. Gemini 3 Flash is the sweet spot for most production apps at $0.50/1M
  3. Gemini 2.5 Flash-Lite at $0.10/1M is the cheapest production model from any major provider
  4. Native multimodal support means images, audio, and video are processed as tokens — no separate API needed
  5. Always use context caching for repeated prompts to slash costs by up to 90%
  6. Free tier is available for prototyping — start building at zero cost

Ready to Build?

Head over to Google AI Studio to experiment with all these models for free, or check the official pricing page for the latest rates.


Prices are current as of May 2026. Always verify with Google’s official documentation before production deployment.

🧮
Calculate Your AI API Costs Compare 30+ models instantly — Gemini, OpenAI, Grok & Claude
Open Calculator →
EXCEL / SHEETS TEMPLATE

Download the 2026 AI API Cost Optimization Spreadsheet

A complete, ready-to-use template to model, calculate, and project your API bills for Gemini, OpenAI, Grok, and Claude.

Professor XAI
Professor XAI ML Engineer passionate about advancing AI technologies and building intelligent systems.
comments powered by Disqus