Merge Landing Page

Gemini 3 Pro:
Everything you need to know about the model

Gemini 3 Pro is a Google model available through Merge Gateway. Use it with Gateway routing policies, spend controls, request logs, and a 1,048,576 token context window. It supports streaming, structured outputs, tool calling, vision through at least one Gateway vendor route.

Gemini 3 Pro pricing

| Vendor | Input / 1M tokens | Output / 1M tokens | Zero data retention | | --- | ---: | ---: | --- | | Google | $2.00 | $12.00 | No |

Test Gemini 3 Pro with Merge Gateway’s Simulator

Gemini 3 Pro

Model

System prompt

Synced

User message

Synced

Response

Run simulation to see response

Cost

—

Tokens

—

Latency

—

Ready to try it out?

Start routing requests to hundreds of large language models in your product within minutes.

Start building for free

Get a demo

Route requests to Gemini 3 Pro with Merge Gateway

Merge Gateway is a unified LLM API that lets your product route requests to Gemini 3 Pro and every other major model through a single endpoint. You get built-in fallback routing, per-request cost tracking, zero data retention support, and observability without changing your application architecture.

To get started in seconds, add our Gateway Implementation skill to your project, or pick your preferred SDK below. Check out our other quick start skills here.

Install the Merge Gateway SDK

Python

1{
2  "mcpServers": {
3    "agent-handler": {
4      "url": "https://ah-api-develop.merge.dev/api/v1/tool-packs/{TOOL_PACK_ID}/registered-users/{REGISTERED_USER_ID}/mcp",
5      "headers": {
6        "Authorization": "Bearer yMt*****"
7      }
8    }
9  }
10}
11

Make your first API call

Python

1{
2  "mcpServers": {
3    "agent-handler": {
4      "url": "https://ah-api-develop.merge.dev/api/v1/tool-packs/{TOOL_PACK_ID}/registered-users/{REGISTERED_USER_ID}/mcp",
5      "headers": {
6        "Authorization": "Bearer yMt*****"
7      }
8    }
9  }
10}
11

Try a diffrent model

Python

1{
2  "mcpServers": {
3    "agent-handler": {
4      "url": "https://ah-api-develop.merge.dev/api/v1/tool-packs/{TOOL_PACK_ID}/registered-users/{REGISTERED_USER_ID}/mcp",
5      "headers": {
6        "Authorization": "Bearer yMt*****"
7      }
8    }
9  }
10}
11

Install the Merge Gateway SDK

Python

1{
2  "mcpServers": {
3    "agent-handler": {
4      "url": "https://ah-api-develop.merge.dev/api/v1/tool-packs/{TOOL_PACK_ID}/registered-users/{REGISTERED_USER_ID}/mcp",
5      "headers": {
6        "Authorization": "Bearer yMt*****"
7      }
8    }
9  }
10}
11

Make your first API call

Python

1{
2  "mcpServers": {
3    "agent-handler": {
4      "url": "https://ah-api-develop.merge.dev/api/v1/tool-packs/{TOOL_PACK_ID}/registered-users/{REGISTERED_USER_ID}/mcp",
5      "headers": {
6        "Authorization": "Bearer yMt*****"
7      }
8    }
9  }
10}
11

Try a diffrent model

Python

1{
2  "mcpServers": {
3    "agent-handler": {
4      "url": "https://ah-api-develop.merge.dev/api/v1/tool-packs/{TOOL_PACK_ID}/registered-users/{REGISTERED_USER_ID}/mcp",
5      "headers": {
6        "Authorization": "Bearer yMt*****"
7      }
8    }
9  }
10}
11

Install the Merge Gateway SDK

Python

1{
2  "mcpServers": {
3    "agent-handler": {
4      "url": "https://ah-api-develop.merge.dev/api/v1/tool-packs/{TOOL_PACK_ID}/registered-users/{REGISTERED_USER_ID}/mcp",
5      "headers": {
6        "Authorization": "Bearer yMt*****"
7      }
8    }
9  }
10}
11

Make your first API call

Python

1{
2  "mcpServers": {
3    "agent-handler": {
4      "url": "https://ah-api-develop.merge.dev/api/v1/tool-packs/{TOOL_PACK_ID}/registered-users/{REGISTERED_USER_ID}/mcp",
5      "headers": {
6        "Authorization": "Bearer yMt*****"
7      }
8    }
9  }
10}
11

Try a diffrent model

Python

1{
2  "mcpServers": {
3    "agent-handler": {
4      "url": "https://ah-api-develop.merge.dev/api/v1/tool-packs/{TOOL_PACK_ID}/registered-users/{REGISTERED_USER_ID}/mcp",
5      "headers": {
6        "Authorization": "Bearer yMt*****"
7      }
8    }
9  }
10}
11

Install the Merge Gateway SDK

Python

1{
2  "mcpServers": {
3    "agent-handler": {
4      "url": "https://ah-api-develop.merge.dev/api/v1/tool-packs/{TOOL_PACK_ID}/registered-users/{REGISTERED_USER_ID}/mcp",
5      "headers": {
6        "Authorization": "Bearer yMt*****"
7      }
8    }
9  }
10}
11

Make your first API call

Python

1{
2  "mcpServers": {
3    "agent-handler": {
4      "url": "https://ah-api-develop.merge.dev/api/v1/tool-packs/{TOOL_PACK_ID}/registered-users/{REGISTERED_USER_ID}/mcp",
5      "headers": {
6        "Authorization": "Bearer yMt*****"
7      }
8    }
9  }
10}
11

Try a diffrent model

Python

1{
2  "mcpServers": {
3    "agent-handler": {
4      "url": "https://ah-api-develop.merge.dev/api/v1/tool-packs/{TOOL_PACK_ID}/registered-users/{REGISTERED_USER_ID}/mcp",
5      "headers": {
6        "Authorization": "Bearer yMt*****"
7      }
8    }
9  }
10}
11

Explore other models available in Merge Gateway

Amazon Nova 2 Lite

Amazon.Nova 2 Sonic V1:0

Amazon Nova Lite (US Cross-Region)

Amazon Nova Micro (US Cross-Region)

Amazon Nova Premier

Amazon Nova Pro

arcee-ai/Trinity-Large-Thinking

ByteDance-Seed/UI-TARS-1.5-7B

Claude Opus 4

Claude Opus 4.6

Claude Opus 4.7

Claude Opus 4.8

Claude Sonnet 4 20250514

Claude Sonnet 4 6

Codestral

Codestral 2508

Computer Use Preview

deepseek-ai/DeepSeek-V3.1

DeepSeek-R1

DeepSeek R1 (0528)

DeepSeek V3

Deepseek V32

DeepSeek V3.2

DeepSeek V4 Flash

Gemini 3 Pro FAQ

If you have additional questions about Gemini 3 Pro, we've addressed several more below. Keep in mind that this information was written on 6/2/2026 and may change over time.

Heading

What other models does Google offer?

Google's Gemini lineup spans several tiers designed for different cost, latency, and capability tradeoffs. Here are some other models Google supports:

Gemini 2.5 Pro: Gemini 2.5 Pro is Google's previous-generation flagship, priced at $1.25 per million input tokens and $10.00 per million output tokens, supporting extended reasoning and a 1M token context window, and well suited for complex analytical tasks at a lower price point than Gemini 3 Pro

Gemini 3 Flash: Gemini 3 Flash is a preview-stage model from Google that pairs 1M token context with a lower price of $0.50 per million input tokens and $3.00 per million output tokens, making it the cost-efficient complement to Gemini 3 Pro for teams that need the newer generation at reduced spend

Gemini 2.5 Flash: Gemini 2.5 Flash is a stable mid-tier model priced at $0.30 per million input tokens, offering 203.2 tokens per second output speed and an above-average Intelligence Index score, positioned as a reliable general-purpose option before stepping up to the Gemini 3 series

Gemini 2.5 Flash Lite: Gemini 2.5 Flash Lite is the fastest and most affordable model in Google's current lineup, priced at $0.10 per million input tokens and $0.40 per million output tokens, with the highest output speed across all evaluated models and suited for high-volume, latency-sensitive workloads where cost is the primary constraint

How does Gemini 3 Pro differ from Google's other models?

Gemini 3 Pro sits at the premium end of Google's Gemini 3 series, designed for tasks that require the highest available reasoning depth and multimodal input support in the Gemini 3 generation.

Pricing: Input costs $2.00 per million tokens and output costs $12.00 per million tokens. That is 4x the input cost of Gemini 2.5 Pro at $1.25 and over 4x the output cost, placing Gemini 3 Pro at the top of Google's pricing tier

Context window: Supports 1M tokens of input context, matching Gemini 2.5 Pro and Gemini 3 Flash. This is sufficient for the vast majority of long-document and multi-turn workloads

Intelligence Index: Ranked #30 out of 150 models evaluated on the Artificial Analysis Intelligence Index, positioning it among the top tier of all publicly available models across providers

Speed: Output speed figures were not yet available at the time of writing. Gemini 3 Pro is a reasoning-class model, and reasoning models typically trade lower throughput for higher answer quality

Capabilities: Accepts text, image, speech, and video inputs. As a proprietary reasoning model, it generates extended chain-of-thought before producing answers, which is a capability not present in Gemini 2.5 Flash or Gemini 3 Flash in their non-reasoning modes

Gemini 3 Pro is the right choice for workflows where answer accuracy and reasoning depth matter more than cost or raw throughput, such as scientific analysis, complex code review, or multi-step agentic tasks.

What models should I consider using alongside Gemini 3 Pro?

No single model is optimal for every task. Here are models worth pairing with Gemini 3 Pro depending on what your product needs:

Gemini 3 Flash: For requests that require the Gemini 3 generation's capabilities but don't need full reasoning depth, route to Gemini 3 Flash at $0.50 per million input tokens to capture most of the generation's quality gains at a fraction of the cost

Gemini 2.5 Flash Lite: For bulk preprocessing, classification, or extraction tasks feeding into a Gemini 3 Pro reasoning step, Gemini 2.5 Flash Lite at $0.10 per million input tokens handles the high-volume groundwork so Gemini 3 Pro only processes what requires deep analysis

Claude Opus 4 (Anthropic): For creative long-form generation or nuanced instruction-following tasks where a cross-provider reasoning model comparison is warranted, Claude Opus 4 offers an alternative flagship-tier benchmark to validate which model performs better on your specific eval

o3 (OpenAI): For mathematical reasoning, competitive programming, and STEM benchmarks where a second top-tier reasoning model adds coverage, o3 from OpenAI serves as a high-quality fallback or A/B comparison partner for Gemini 3 Pro on reasoning-heavy workloads

Llama 3.3 70B (Meta): For teams with data residency constraints that require on-premises or self-hosted deployment, Llama 3.3 70B provides an open-weight alternative for instruction-following tasks where a cloud API is not acceptable

What are the challenges of using Gemini 3 Pro in my product?

Like any production LLM, Gemini 3 Pro comes with tradeoffs worth planning for:

Cost at scale: At $2.00 per million input tokens and $12.00 per million output tokens, Gemini 3 Pro is one of the more expensive models available. At meaningful request volumes, token costs compound quickly, and a single workload shift to Gemini 3 Pro without cost controls can significantly overshoot budget

Provider dependency: Routing all traffic to Google's API means any quota restriction, regional outage, or model deprecation directly disrupts every workload relying on Gemini 3 Pro. Preview-stage models like this one carry additional deprecation risk as newer versions (such as Gemini 3.1 Pro) are released

Preview status and stability: Gemini 3 Pro is designated as a preview model, which means API behavior, rate limits, and availability guarantees may differ from stable production models. Teams building latency-critical features should account for potential variability

Latency for interactive use cases: Reasoning models generate extended chain-of-thought before producing a final answer, which adds wall-clock latency per request. Real-time applications like chatbots or autocomplete features are poor fits without prompt-level controls to limit thinking depth

Output verbosity and token cost interaction: Gemini 3 Pro generated 56M output tokens during the Artificial Analysis Intelligence Index evaluation, suggesting it can produce lengthy responses. At $12.00 per million output tokens, verbose completions make effective cost per task higher than headline pricing implies

Why should I use Merge Gateway to route LLM requests with Gemini 3 Pro and every other model?

Using Gemini 3 Pro through Merge Gateway gives you access to the model itself and the infrastructure layer around it:

One API, every provider: Access Gemini 3 Pro and every other major LLM through a single endpoint and API key. Change providers by swapping the model string, with no application code changes required

Intelligent routing and automatic failover: Merge routes around Google outages automatically. Routing policies based on cost, latency, or quality can reduce spend by 40-60% without touching your application code, which matters significantly given Gemini 3 Pro's output pricing

Cost governance: Set hard or soft project budgets so Gemini 3 Pro spend stays within plan. Every request is attributed to a model, project, and tag in a unified billing dashboard across all providers

Build Your Own Router: Define what "best" means for your traffic by selecting from curated ML benchmarks or adding your own eval scores. The router scores each available model against your weights and picks the winner per request, with a plain-language explanation of every decision

Security and compliance controls: Apply DLP rules and prompt injection protection before every request reaches Google. Enforce per-project model and region policies without adding that logic to your application

How can I start using Merge Gateway to route requests with Gemini 3 Pro?

Getting Gemini 3 Pro running through Merge Gateway takes a few minutes:

1. Create an account and get your API key from the dashboard.

2. Install the Merge Gateway SDK: run pip install merge-gateway-sdk (Python) or npm install merge-gateway-sdk (Node). Alternatively, if you're already using the OpenAI SDK, set base_url = "https://api-gateway.merge.dev/v1/openai" and your existing code works as-is.

3. Make your first request using the provider/model format. For Gemini 3 Pro, the model string is google/gemini-3-pro. Swap the model string to route to any other provider without changing anything else.

4. Configure a routing policy in the dashboard to set failover behavior, cost limits, and optimization strategy. Your first policy can be as simple as naming Gemini 3 Pro as primary with one fallback.

Full setup instructions and SDK references are in the Merge Gateway docs.

Try Gemini 3 Pro through Merge Gateway

Route, observe, and control AI requests across providers from one API.

Start building for free

Get a demo