Gemini 2.5 Flash: pricing, performance, and how to route requests

Route requests to Gemini 2.5 Flash with Merge Gateway

Apply your own routing policies, reduce token costs automatically, and see every routing decision in real time with Merge Gateway.

What Gemini 2.5 Flash costs to run

| Vendor | Input / 1M tokens | Output / 1M tokens | Zero data retention |
| --- | ---: | ---: | --- |
| Google | $0.30 | $2.50 | No |
| Vertex AI | $0.30 | $2.50 | Yes |
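To make the per-million-token prices concrete, here is a minimal sketch that estimates the cost of a single request. It assumes the list prices in the table above ($0.30 input and $2.50 output per 1M tokens, the same for both listed vendors) and ignores any discounts or extra fees, which this page does not describe. The function name `estimate_cost` is hypothetical and is not part of any Merge Gateway SDK.

```python
from decimal import Decimal

# List prices in USD per 1M tokens, copied from the pricing table.
INPUT_PRICE_PER_M = Decimal("0.30")
OUTPUT_PRICE_PER_M = Decimal("2.50")
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request at the listed prices (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (
        Decimal(input_tokens) * INPUT_PRICE_PER_M
        + Decimal(output_tokens) * OUTPUT_PRICE_PER_M
    )
    return total / TOKENS_PER_M


# 10,000 input tokens cost 0.003 USD and 2,000 output tokens cost 0.005 USD.
cost = estimate_cost(10_000, 2_000)
print(f"Estimated cost: ${cost} USD")
```

Because output tokens cost roughly eight times as much as input tokens at these prices, long completions tend to dominate the bill even when prompts are large.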

Test Gemini 2.5 Flash with Gateway’s Simulator

See a prompt's output, token spend, latency, and more with Gemini 2.5 Flash.

Route requests to Gemini 2.5 Flash in minutes

To get started in seconds, add our Gateway Implementation skill to your project, or pick your preferred SDK below. Check out our other quick start skills here.

Install the Merge Gateway SDK

Python

$ pip install merge-gateway-sdk

Send a request

Python

from merge_gateway import MergeGateway

client = MergeGateway(api_key="YOUR_API_KEY")

response = client.responses.create(
    model="openai/gpt-5.2",
    input=[
        {"type": "message", "role": "system", "content": "You are a helpful programming tutor. Explain the concepts clearly with practical examples."},
        {"type": "message", "role": "user", "content": "Explain the concept of recursion in programming with a simple set of examples."},
    ],
)

print(response.output[0].content[0].text)

Try a different model

Swap the model string to route to a different provider. No other code changes needed.

Anthropic

response = client.responses.create(
    model="anthropic/claude-sonnet-4-20250514",
    input=[
        {"type": "message", "role": "system", "content": "You are a helpful programming tutor. Explain the concepts clearly with practical examples."},
        {"type": "message", "role": "user", "content": "Explain the concept of recursion in programming with a simple set of examples."},
    ],
)
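Since only the model string changes between providers, the request arguments can be built once and reused. The following is a minimal sketch of that idea; `build_request` is a hypothetical helper, not part of the Merge Gateway SDK, and the sketch only constructs the arguments rather than sending them.

```python
from typing import Any

SYSTEM_PROMPT = (
    "You are a helpful programming tutor. "
    "Explain the concepts clearly with practical examples."
)


def build_request(model: str, question: str) -> dict[str, Any]:
    """Build keyword arguments for client.responses.create (hypothetical helper).

    Only the `model` value differs between providers; the input stays the same.
    """
    return {
        "model": model,
        "input": [
            {"type": "message", "role": "system", "content": SYSTEM_PROMPT},
            {"type": "message", "role": "user", "content": question},
        ],
    }


question = "Explain the concept of recursion in programming with a simple set of examples."
models = ("openai/gpt-5.2", "anthropic/claude-sonnet-4-20250514")
payloads = [build_request(model, question) for model in models]

# Each payload could then be sent with: client.responses.create(**payload)
for payload in payloads:
    print(payload["model"])
```

Keeping the model string as the only varying argument makes it straightforward to compare providers on the same prompt.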

Point to Gateway

Python

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api-gateway.merge.dev/v1/openai",
)

Send a request

Use the standard chat.completions.create method. No provider prefix needed on the model name.

Python

response = client.chat.completions.create(
    model="gpt-5.2",
    messages=[
        {"role": "system", "content": "You are a helpful programming tutor. Explain the concepts clearly with practical examples."},
        {"role": "user", "content": "Explain the concept of recursion in programming with a simple set of examples."},
    ],
)

print(response.choices[0].message.content)
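If the same codebase uses both naming styles, a small helper can convert a prefixed identifier into the bare name used on the OpenAI-compatible endpoint. This is a sketch under an assumption inferred from the examples on this page (`openai/gpt-5.2` versus `gpt-5.2`): that the bare name is whatever follows the first slash. `strip_provider_prefix` is a hypothetical helper, not a Gateway API.

```python
def strip_provider_prefix(model_id: str) -> str:
    """Return the model name without a leading "provider/" segment (hypothetical helper).

    Identifiers that contain no slash are returned unchanged.
    """
    _provider, separator, name = model_id.partition("/")
    return name if separator else model_id


print(strip_provider_prefix("openai/gpt-5.2"))
print(strip_provider_prefix("gpt-5.2"))
```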

Install packages

npm install merge-gateway-ai-sdk-provider ai

Create the provider

TypeScript

import { createMergeGateway } from "merge-gateway-ai-sdk-provider";

const gateway = createMergeGateway({
  apiKey: "YOUR_API_KEY",
});

Send a request

Use generateText to send a request. Model names use the provider/model format.

TypeScript

import { generateText } from "ai";

const { text } = await generateText({
  model: gateway("openai/gpt-4o"),
  prompt: "Explain the concept of recursion in programming with a simple set of examples.",
});

console.log(text);

If you already have @ai-sdk/openai installed, point it at Gateway with a base URL change:

TypeScript

import { createOpenAI } from "@ai-sdk/openai";

const gateway = createOpenAI({
  apiKey: "YOUR_API_KEY",
  baseURL: "https://api-gateway.merge.dev/v1/ai-sdk",
});

// All generateText/streamText calls work unchanged

Install the Merge Gateway SDK

Anthropic SDK

from anthropic import Anthropic

client = Anthropic(
    api_key="YOUR_API_KEY",
    base_url="https://api-gateway.merge.dev/v1/anthropic",
)

message = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Explain the concept of recursion in programming with a simple set of examples."},
    ],
)

print(message.content[0].text)

Explore other models available in Merge Gateway

Gemma 3 12B

Gemma 3 27B

Gemma 3 27B IT

Gemma 3 4B

Gemma 4 26B-A4B

Gemma 4 31B IT

GLM-4 32B 0414 128K

GLM-4.5

GLM-4.5 Air

GLM-4.5-Air

GLM-4.5 AirX

GLM-4.5-AirX

Glm 4.5V

GLM-4.5V

GLM-4.5 X

GLM-4.6

GLM-4.7

GLM 4.7 Flash

Gemini 2.5 Flash FAQ

What provider owns Gemini 2.5 Flash?

Gemini 2.5 Flash is a Google model.

Which vendors can run Gemini 2.5 Flash?

Google is the default listed vendor, and other active vendors may also be available.

What context window does Gemini 2.5 Flash support?

Gemini 2.5 Flash supports 1,048,576 tokens on the primary listed vendor route.
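As a rough illustration of working within that limit, the sketch below checks whether a request fits before it is sent. It assumes the caller already has a token count from a tokenizer (none is specified on this page), and it treats any reserved output tokens as counting against the same window, which is a conservative assumption rather than documented behavior. `fits_context` is a hypothetical helper.

```python
# Context window from the FAQ answer above (equal to 2**20 tokens).
CONTEXT_WINDOW_TOKENS = 1_048_576


def fits_context(prompt_tokens: int, reserved_output_tokens: int = 0) -> bool:
    """Return True if the prompt plus reserved output fits the window (hypothetical helper).

    Assumption: reserved output tokens are counted against the same limit.
    """
    if prompt_tokens < 0 or reserved_output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return prompt_tokens + reserved_output_tokens <= CONTEXT_WINDOW_TOKENS


print(fits_context(900_000, reserved_output_tokens=8_192))
print(fits_context(1_048_000, reserved_output_tokens=8_192))
```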

What capabilities does Gemini 2.5 Flash support?

Gateway currently lists streaming, structured outputs, tool calling, and vision support for Gemini 2.5 Flash across its available vendor routes.

Try Gemini 2.5 Flash through Merge Gateway

Route, observe, and control AI requests across providers from one API.

