Gemini 2.5 Flash-Lite: pricing, performance, and how to route requests

Route requests to Gemini 2.5 Flash-Lite with Merge Gateway

Apply your own routing policies, reduce token costs automatically, and see every routing decision in real time with Merge Gateway.

What Gemini 2.5 Flash-Lite costs to run

| Vendor | Input / 1M tokens | Output / 1M tokens | Zero data retention |
| --- | ---: | ---: | --- |
| Vertex AI | $0.1000 | $0.4000 | Yes |
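
To make the per-token prices concrete, the sketch below estimates the cost of a single request from its token counts. It assumes only the Vertex AI prices listed in the table; `estimate_cost_usd` is a hypothetical helper for illustration, not part of any Merge Gateway SDK, and real bills may differ.

```python
# Per-million-token prices copied from the Vertex AI row of the table above.
INPUT_USD_PER_MILLION = 0.10
OUTPUT_USD_PER_MILLION = 0.40


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (input_tokens * INPUT_USD_PER_MILLION
             + output_tokens * OUTPUT_USD_PER_MILLION)
    return total / 1_000_000


# Example: a request with 2,000 input tokens and 500 output tokens.
cost = estimate_cost_usd(input_tokens=2_000, output_tokens=500)
print(f"${cost:.6f}")
```

At these prices, output tokens cost four times as much as input tokens, so long completions dominate the estimate sooner than long prompts do.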

Test Gemini 2.5 Flash-Lite with Gateway’s Simulator

See a prompt's output, token spend, latency, and more with Gemini 2.5 Flash-Lite.

Route requests to Gemini 2.5 Flash-Lite in minutes

To get started in seconds, add our Gateway Implementation skill to your project, or pick your preferred SDK below. Check out our other quick start skills here.

Install the Merge Gateway SDK

Python

$ pip install merge-gateway-sdk

Send a request

Python

from merge_gateway import MergeGateway

client = MergeGateway(api_key="YOUR_API_KEY")

response = client.responses.create(
    model="openai/gpt-5.2",
    input=[
        {"type": "message", "role": "system", "content": "You are a helpful programming tutor. Explain the concepts clearly with practical examples."},
        {"type": "message", "role": "user", "content": "Explain the concept of recursion in programming with a simple set of examples."},
    ],
)

print(response.output[0].content[0].text)

Try a different model

Swap the model string to route to a different provider. No other code changes needed.

Anthropic

response = client.responses.create(
    model="anthropic/claude-sonnet-4-20250514",
    input=[
        {"type": "message", "role": "system", "content": "You are a helpful programming tutor. Explain the concepts clearly with practical examples."},
        {"type": "message", "role": "user", "content": "Explain the concept of recursion in programming with a simple set of examples."},
    ],
)
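
Because only the model string changes between providers, the same prompt can be sent to several models in a loop. The following is a minimal sketch, assuming a `client` created as in the Merge Gateway SDK example and the `response.output[0].content[0].text` response shape shown there. `ask_each_model` is a hypothetical helper, not part of the SDK.

```python
def ask_each_model(client, models, question):
    """Send the same question to every model and collect the text replies.

    `client` is assumed to expose `responses.create(model=..., input=...)`
    as in the example above; this helper is illustrative only.
    """
    replies = {}
    for model in models:
        response = client.responses.create(
            model=model,
            input=[{"type": "message", "role": "user", "content": question}],
        )
        # Assumed response shape, taken from the request example above.
        replies[model] = response.output[0].content[0].text
    return replies
```

A call such as `ask_each_model(client, ["openai/gpt-5.2", "anthropic/claude-sonnet-4-20250514"], "Explain recursion.")` would return a dictionary keyed by model string, which makes side-by-side comparison straightforward.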

Point to Gateway

Python

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api-gateway.merge.dev/v1/openai",
)

Send a request

Use the standard chat.completions.create method. No provider prefix needed on the model name.

Python

response = client.chat.completions.create(
    model="gpt-5.2",
    messages=[
        {"role": "system", "content": "You are a helpful programming tutor. Explain the concepts clearly with practical examples."},
        {"role": "user", "content": "Explain the concept of recursion in programming with a simple set of examples."},
    ],
)

print(response.choices[0].message.content)
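
The Merge Gateway SDK examples use `provider/model` names, while the OpenAI-compatible route above takes the bare model name. If one codebase stores prefixed names and calls both routes, a small conversion helper can bridge the two. This is a sketch only: `strip_provider_prefix` is a hypothetical function, and it assumes the prefix is everything before the first `/`.

```python
def strip_provider_prefix(model: str) -> str:
    """Return the model name without a leading 'provider/' prefix.

    Names that contain no '/' are returned unchanged.
    """
    _provider, separator, name = model.partition("/")
    return name if separator else model


print(strip_provider_prefix("openai/gpt-5.2"))
print(strip_provider_prefix("gpt-5.2"))
```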

Install packages

npm install merge-gateway-ai-sdk-provider ai

Create the provider

TypeScript

import { createMergeGateway } from "merge-gateway-ai-sdk-provider";

const gateway = createMergeGateway({
  apiKey: "YOUR_API_KEY",
});

Send a request

Use generateText to send a request. Model names use the provider/model format.

TypeScript

import { generateText } from "ai";

const { text } = await generateText({
  model: gateway("openai/gpt-4o"),
  prompt: "Explain the concept of recursion in programming with a simple set of examples.",
});

console.log(text);

If you already have @ai-sdk/openai installed, point it at Gateway with a base URL change:

TypeScript

import { createOpenAI } from "@ai-sdk/openai";

const gateway = createOpenAI({
  apiKey: "YOUR_API_KEY",
  baseURL: "https://api-gateway.merge.dev/v1/ai-sdk",
});

// All generateText/streamText calls work unchanged

Point the Anthropic SDK to Gateway

Anthropic SDK

from anthropic import Anthropic

client = Anthropic(
    api_key="YOUR_API_KEY",
    base_url="https://api-gateway.merge.dev/v1/anthropic",
)

message = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Explain the concept of recursion in programming with a simple set of examples."},
    ],
)

print(message.content[0].text)

Explore other models available in Merge Gateway

GLM 4.7 Flash

GLM-4.7 FlashX

GLM-5

GLM 5.1

GLM-5.1

GLM-5.2

GLM-5 Turbo

GLM-5-Turbo

GPT-3.5 Turbo

GPT-3.5 Turbo (0125)

GPT-3.5 Turbo 1106

GPT-3.5 Turbo 16K

GPT-4

GPT-4 0613

GPT-4.1

GPT-4.1 (2025-04-14)

GPT-4.1 Mini

Gemini 2.5 Flash-Lite FAQ

What provider owns Gemini 2.5 Flash-Lite?

Gemini 2.5 Flash-Lite is a Google model.

Which vendors can run Gemini 2.5 Flash-Lite?

Vertex AI is the default listed vendor, and other active vendors may also be available.

What context window does Gemini 2.5 Flash-Lite support?

Gemini 2.5 Flash-Lite supports 1,000,000 tokens on the primary listed vendor route.
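
For a quick pre-flight check against that limit, the sketch below estimates whether a prompt is likely to fit. It assumes roughly four characters per token, which is a coarse rule of thumb rather than the model's real tokenizer, so treat the result as an approximation. `fits_context_window` is a hypothetical helper.

```python
CONTEXT_WINDOW_TOKENS = 1_000_000  # from the FAQ answer above
ASSUMED_CHARS_PER_TOKEN = 4        # rough assumption, not a real tokenizer


def fits_context_window(prompt: str, reserved_output_tokens: int = 0) -> bool:
    """Roughly check whether a prompt fits the listed context window."""
    # Ceiling division so partial tokens are counted as whole tokens.
    estimated_tokens = -(-len(prompt) // ASSUMED_CHARS_PER_TOKEN)
    return estimated_tokens + reserved_output_tokens <= CONTEXT_WINDOW_TOKENS


print(fits_context_window("Explain recursion with a simple example."))
```

Passing `reserved_output_tokens` leaves headroom in the estimate for the reply, in case the response counts against the same window.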

What capabilities does Gemini 2.5 Flash-Lite support?

Gateway currently lists streaming, structured outputs, tool calling, and vision support for Gemini 2.5 Flash-Lite across its available vendor routes.

Try Gemini 2.5 Flash-Lite through Merge Gateway

Route, observe, and control AI requests across providers from one API.

