Gemini 3 Pro:
Everything you need to know about the model

Gemini 3 Pro is a Google model available through Merge Gateway. Use it with Gateway routing policies, spend controls, request logs, and a 1,048,576 token context window. It supports streaming, structured outputs, tool calling, vision through at least one Gateway vendor route.

Gemini 3 Pro pricing

| Vendor | Input / 1M tokens | Output / 1M tokens | Zero data retention | | --- | ---: | ---: | --- | | Google | $2.00 | $12.00 | No |

Test Gemini 3 Pro with Merge Gateway’s Simulator

Gemini 3 Pro
Synced
Synced
Run simulation to see response

Ready to try it out?

Start routing requests to hundreds of large language models in your product within minutes.

Route requests to Gemini 3 Pro with Merge Gateway

Merge Gateway is a unified LLM API that lets your product route requests to Gemini 3 Pro and every other major model through a single endpoint. You get built-in fallback routing, per-request cost tracking, zero data retention support, and observability without changing your application architecture.
To get started in seconds, add our Gateway Implementation skill to your project, or pick your preferred SDK below. Check out our other quick start skills here.
Install the Merge Gateway SDK
Python
1{
2  "mcpServers": {
3    "agent-handler": {
4      "url": "https://ah-api-develop.merge.dev/api/v1/tool-packs/{TOOL_PACK_ID}/registered-users/{REGISTERED_USER_ID}/mcp",
5      "headers": {
6        "Authorization": "Bearer yMt*****"
7      }
8    }
9  }
10}
11
Make your first API call
Python
1{
2  "mcpServers": {
3    "agent-handler": {
4      "url": "https://ah-api-develop.merge.dev/api/v1/tool-packs/{TOOL_PACK_ID}/registered-users/{REGISTERED_USER_ID}/mcp",
5      "headers": {
6        "Authorization": "Bearer yMt*****"
7      }
8    }
9  }
10}
11
Try a diffrent model
Python
1{
2  "mcpServers": {
3    "agent-handler": {
4      "url": "https://ah-api-develop.merge.dev/api/v1/tool-packs/{TOOL_PACK_ID}/registered-users/{REGISTERED_USER_ID}/mcp",
5      "headers": {
6        "Authorization": "Bearer yMt*****"
7      }
8    }
9  }
10}
11
Install the Merge Gateway SDK
Python
1{
2  "mcpServers": {
3    "agent-handler": {
4      "url": "https://ah-api-develop.merge.dev/api/v1/tool-packs/{TOOL_PACK_ID}/registered-users/{REGISTERED_USER_ID}/mcp",
5      "headers": {
6        "Authorization": "Bearer yMt*****"
7      }
8    }
9  }
10}
11
Make your first API call
Python
1{
2  "mcpServers": {
3    "agent-handler": {
4      "url": "https://ah-api-develop.merge.dev/api/v1/tool-packs/{TOOL_PACK_ID}/registered-users/{REGISTERED_USER_ID}/mcp",
5      "headers": {
6        "Authorization": "Bearer yMt*****"
7      }
8    }
9  }
10}
11
Try a diffrent model
Python
1{
2  "mcpServers": {
3    "agent-handler": {
4      "url": "https://ah-api-develop.merge.dev/api/v1/tool-packs/{TOOL_PACK_ID}/registered-users/{REGISTERED_USER_ID}/mcp",
5      "headers": {
6        "Authorization": "Bearer yMt*****"
7      }
8    }
9  }
10}
11
Install the Merge Gateway SDK
Python
1{
2  "mcpServers": {
3    "agent-handler": {
4      "url": "https://ah-api-develop.merge.dev/api/v1/tool-packs/{TOOL_PACK_ID}/registered-users/{REGISTERED_USER_ID}/mcp",
5      "headers": {
6        "Authorization": "Bearer yMt*****"
7      }
8    }
9  }
10}
11
Make your first API call
Python
1{
2  "mcpServers": {
3    "agent-handler": {
4      "url": "https://ah-api-develop.merge.dev/api/v1/tool-packs/{TOOL_PACK_ID}/registered-users/{REGISTERED_USER_ID}/mcp",
5      "headers": {
6        "Authorization": "Bearer yMt*****"
7      }
8    }
9  }
10}
11
Try a diffrent model
Python
1{
2  "mcpServers": {
3    "agent-handler": {
4      "url": "https://ah-api-develop.merge.dev/api/v1/tool-packs/{TOOL_PACK_ID}/registered-users/{REGISTERED_USER_ID}/mcp",
5      "headers": {
6        "Authorization": "Bearer yMt*****"
7      }
8    }
9  }
10}
11
Install the Merge Gateway SDK
Python
1{
2  "mcpServers": {
3    "agent-handler": {
4      "url": "https://ah-api-develop.merge.dev/api/v1/tool-packs/{TOOL_PACK_ID}/registered-users/{REGISTERED_USER_ID}/mcp",
5      "headers": {
6        "Authorization": "Bearer yMt*****"
7      }
8    }
9  }
10}
11
Make your first API call
Python
1{
2  "mcpServers": {
3    "agent-handler": {
4      "url": "https://ah-api-develop.merge.dev/api/v1/tool-packs/{TOOL_PACK_ID}/registered-users/{REGISTERED_USER_ID}/mcp",
5      "headers": {
6        "Authorization": "Bearer yMt*****"
7      }
8    }
9  }
10}
11
Try a diffrent model
Python
1{
2  "mcpServers": {
3    "agent-handler": {
4      "url": "https://ah-api-develop.merge.dev/api/v1/tool-packs/{TOOL_PACK_ID}/registered-users/{REGISTERED_USER_ID}/mcp",
5      "headers": {
6        "Authorization": "Bearer yMt*****"
7      }
8    }
9  }
10}
11

Explore other models available in Merge Gateway

model logo
Amazon Nova 2 Lite
model logo
Amazon.Nova 2 Sonic V1:0
model logo
Amazon Nova Lite (US Cross-Region)
model logo
Amazon Nova Micro (US Cross-Region)
model logo
Amazon Nova Premier
model logo
Amazon Nova Pro
model logo
arcee-ai/Trinity-Large-Thinking
model logo
ByteDance-Seed/UI-TARS-1.5-7B
model logo
Claude Opus 4
model logo
Claude Opus 4.6
model logo
Claude Opus 4.7
model logo
Claude Opus 4.8
model logo
Claude Sonnet 4 20250514
model logo
Claude Sonnet 4 6
model logo
Codestral
model logo
Codestral 2508
model logo
Computer Use Preview
model logo
deepseek-ai/DeepSeek-V3.1
model logo
DeepSeek-R1
model logo
DeepSeek R1 (0528)
model logo
DeepSeek V3
model logo
Deepseek V32
model logo
DeepSeek V3.2
model logo
DeepSeek V4 Flash

Gemini 3 Pro FAQ

If you have additional questions about Gemini 3 Pro, we've addressed several more below. Keep in mind that this information was written on 6/2/2026 and may change over time.

Heading

What other models does Google offer?

Google's Gemini lineup spans several tiers designed for different cost, latency, and capability tradeoffs. Here are some other models Google supports:

  • Gemini 2.5 Pro: Gemini 2.5 Pro is Google's previous-generation flagship, priced at $1.25 per million input tokens and $10.00 per million output tokens, supporting extended reasoning and a 1M token context window, and well suited for complex analytical tasks at a lower price point than Gemini 3 Pro
  • Gemini 3 Flash: Gemini 3 Flash is a preview-stage model from Google that pairs 1M token context with a lower price of $0.50 per million input tokens and $3.00 per million output tokens, making it the cost-efficient complement to Gemini 3 Pro for teams that need the newer generation at reduced spend
  • Gemini 2.5 Flash: Gemini 2.5 Flash is a stable mid-tier model priced at $0.30 per million input tokens, offering 203.2 tokens per second output speed and an above-average Intelligence Index score, positioned as a reliable general-purpose option before stepping up to the Gemini 3 series
  • Gemini 2.5 Flash Lite: Gemini 2.5 Flash Lite is the fastest and most affordable model in Google's current lineup, priced at $0.10 per million input tokens and $0.40 per million output tokens, with the highest output speed across all evaluated models and suited for high-volume, latency-sensitive workloads where cost is the primary constraint

How does Gemini 3 Pro differ from Google's other models?

Gemini 3 Pro sits at the premium end of Google's Gemini 3 series, designed for tasks that require the highest available reasoning depth and multimodal input support in the Gemini 3 generation.

  • Pricing: Input costs $2.00 per million tokens and output costs $12.00 per million tokens. That is 4x the input cost of Gemini 2.5 Pro at $1.25 and over 4x the output cost, placing Gemini 3 Pro at the top of Google's pricing tier
  • Context window: Supports 1M tokens of input context, matching Gemini 2.5 Pro and Gemini 3 Flash. This is sufficient for the vast majority of long-document and multi-turn workloads
  • Intelligence Index: Ranked #30 out of 150 models evaluated on the Artificial Analysis Intelligence Index, positioning it among the top tier of all publicly available models across providers
  • Speed: Output speed figures were not yet available at the time of writing. Gemini 3 Pro is a reasoning-class model, and reasoning models typically trade lower throughput for higher answer quality
  • Capabilities: Accepts text, image, speech, and video inputs. As a proprietary reasoning model, it generates extended chain-of-thought before producing answers, which is a capability not present in Gemini 2.5 Flash or Gemini 3 Flash in their non-reasoning modes

Gemini 3 Pro is the right choice for workflows where answer accuracy and reasoning depth matter more than cost or raw throughput, such as scientific analysis, complex code review, or multi-step agentic tasks.

What models should I consider using alongside Gemini 3 Pro?

No single model is optimal for every task. Here are models worth pairing with Gemini 3 Pro depending on what your product needs:

  • Gemini 3 Flash: For requests that require the Gemini 3 generation's capabilities but don't need full reasoning depth, route to Gemini 3 Flash at $0.50 per million input tokens to capture most of the generation's quality gains at a fraction of the cost
  • Gemini 2.5 Flash Lite: For bulk preprocessing, classification, or extraction tasks feeding into a Gemini 3 Pro reasoning step, Gemini 2.5 Flash Lite at $0.10 per million input tokens handles the high-volume groundwork so Gemini 3 Pro only processes what requires deep analysis
  • Claude Opus 4 (Anthropic): For creative long-form generation or nuanced instruction-following tasks where a cross-provider reasoning model comparison is warranted, Claude Opus 4 offers an alternative flagship-tier benchmark to validate which model performs better on your specific eval
  • o3 (OpenAI): For mathematical reasoning, competitive programming, and STEM benchmarks where a second top-tier reasoning model adds coverage, o3 from OpenAI serves as a high-quality fallback or A/B comparison partner for Gemini 3 Pro on reasoning-heavy workloads
  • Llama 3.3 70B (Meta): For teams with data residency constraints that require on-premises or self-hosted deployment, Llama 3.3 70B provides an open-weight alternative for instruction-following tasks where a cloud API is not acceptable

What are the challenges of using Gemini 3 Pro in my product?

Like any production LLM, Gemini 3 Pro comes with tradeoffs worth planning for:

  • Cost at scale: At $2.00 per million input tokens and $12.00 per million output tokens, Gemini 3 Pro is one of the more expensive models available. At meaningful request volumes, token costs compound quickly, and a single workload shift to Gemini 3 Pro without cost controls can significantly overshoot budget
  • Provider dependency: Routing all traffic to Google's API means any quota restriction, regional outage, or model deprecation directly disrupts every workload relying on Gemini 3 Pro. Preview-stage models like this one carry additional deprecation risk as newer versions (such as Gemini 3.1 Pro) are released
  • Preview status and stability: Gemini 3 Pro is designated as a preview model, which means API behavior, rate limits, and availability guarantees may differ from stable production models. Teams building latency-critical features should account for potential variability
  • Latency for interactive use cases: Reasoning models generate extended chain-of-thought before producing a final answer, which adds wall-clock latency per request. Real-time applications like chatbots or autocomplete features are poor fits without prompt-level controls to limit thinking depth
  • Output verbosity and token cost interaction: Gemini 3 Pro generated 56M output tokens during the Artificial Analysis Intelligence Index evaluation, suggesting it can produce lengthy responses. At $12.00 per million output tokens, verbose completions make effective cost per task higher than headline pricing implies

Why should I use Merge Gateway to route LLM requests with Gemini 3 Pro and every other model?

Using Gemini 3 Pro through Merge Gateway gives you access to the model itself and the infrastructure layer around it:

  • One API, every provider: Access Gemini 3 Pro and every other major LLM through a single endpoint and API key. Change providers by swapping the model string, with no application code changes required
  • Intelligent routing and automatic failover: Merge routes around Google outages automatically. Routing policies based on cost, latency, or quality can reduce spend by 40-60% without touching your application code, which matters significantly given Gemini 3 Pro's output pricing
  • Cost governance: Set hard or soft project budgets so Gemini 3 Pro spend stays within plan. Every request is attributed to a model, project, and tag in a unified billing dashboard across all providers
  • Build Your Own Router: Define what "best" means for your traffic by selecting from curated ML benchmarks or adding your own eval scores. The router scores each available model against your weights and picks the winner per request, with a plain-language explanation of every decision
  • Security and compliance controls: Apply DLP rules and prompt injection protection before every request reaches Google. Enforce per-project model and region policies without adding that logic to your application

How can I start using Merge Gateway to route requests with Gemini 3 Pro?

Getting Gemini 3 Pro running through Merge Gateway takes a few minutes:

1. Create an account and get your API key from the dashboard.

2. Install the Merge Gateway SDK: run pip install merge-gateway-sdk (Python) or npm install merge-gateway-sdk (Node). Alternatively, if you're already using the OpenAI SDK, set base_url = "https://api-gateway.merge.dev/v1/openai" and your existing code works as-is.

3. Make your first request using the provider/model format. For Gemini 3 Pro, the model string is google/gemini-3-pro. Swap the model string to route to any other provider without changing anything else.

4. Configure a routing policy in the dashboard to set failover behavior, cost limits, and optimization strategy. Your first policy can be as simple as naming Gemini 3 Pro as primary with one fallback.

Full setup instructions and SDK references are in the Merge Gateway docs.

Try Gemini 3 Pro through Merge Gateway

Route, observe, and control AI requests across providers from one API.