GPT-5 Nano: pricing, performance, and how to route requests

GPT-5 Nano is accessible via Merge Gateway. With Gateway, you can apply routing policies and spend controls, and access per-request logs. Context window and streaming support depend on the provider route you select.

GPT-5 Nano performance*

Intelligence (general reasoning and knowledge): 27%

Coding (code generation and problem-solving): 20%

*Performance data is provided by Artificial Analysis and is subject to change.

GPT-5 Nano pricing

| Vendor | Input / 1M tokens | Output / 1M tokens | Zero data retention |
| --- | ---: | ---: | --- |
| OpenAI | $0.0500 | $0.4000 | Yes |
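
As a rough illustration of how the per-million-token rates above translate into a per-request cost, the sketch below hard-codes the listed OpenAI rates. The function name `estimate_cost_usd` and the token counts are hypothetical, and actual billing depends on how the provider meters tokens, so treat the result as an estimate only.

```python
from decimal import Decimal

# Rates copied from the pricing table above (USD per 1M tokens).
INPUT_RATE = Decimal("0.05")
OUTPUT_RATE = Decimal("0.40")
PER_MILLION = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the cost of one request from its token counts."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE
    return total / PER_MILLION


# Hypothetical request: 2,000 input tokens and 800 output tokens.
print(estimate_cost_usd(2_000, 800))
```

Using `Decimal` rather than floats avoids rounding drift when many small per-request costs are summed.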

Route requests to GPT-5 Nano with Merge Gateway

Merge Gateway is a unified LLM API that lets your product route requests to GPT-5 Nano and every other major model through a single endpoint. You get built-in fallback routing, per-request cost tracking, data loss prevention (DLP), prompt injection protection, and observability without changing your application architecture.

To get started in seconds, add our Gateway Implementation skill to your project, or pick your preferred SDK below. Check out our other quick start skills here.

Install the Merge Gateway SDK

Python

$ pip install merge-gateway-sdk

Send a request

Python

from merge_gateway import MergeGateway

client = MergeGateway(api_key="YOUR_API_KEY")

response = client.responses.create(
    model="openai/gpt-5.2",
    input=[
        {"type": "message", "role": "system", "content": "You are a helpful programming tutor. Explain the concepts clearly with practical examples."},
        {"type": "message", "role": "user", "content": "Explain the concept of recursion in programming with a simple set of examples."},
    ],
)

print(response.output[0].content[0].text)

Try a different model

Swap the model string to route to a different provider. No other code changes needed.

Anthropic

response = client.responses.create(
    model="anthropic/claude-sonnet-4-20250514",
    input=[
        {"type": "message", "role": "system", "content": "You are a helpful programming tutor. Explain the concepts clearly with practical examples."},
        {"type": "message", "role": "user", "content": "Explain the concept of recursion in programming with a simple set of examples."},
    ],
)

Point to Gateway

Python

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api-gateway.merge.dev/v1/openai",
)

Send a request

Use the standard chat.completions.create method. No provider prefix needed on the model name.

Python

response = client.chat.completions.create(
    model="gpt-5.2",
    messages=[
        {"role": "system", "content": "You are a helpful programming tutor. Explain the concepts clearly with practical examples."},
        {"role": "user", "content": "Explain the concept of recursion in programming with a simple set of examples."},
    ],
)

print(response.choices[0].message.content)
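
Because the Merge Gateway SDK examples use the provider/model format while the OpenAI-compatible endpoint takes a bare model name, a small helper can keep a single model setting usable for both. This is a minimal sketch; `strip_provider_prefix` is a hypothetical name, not part of any SDK.

```python
def strip_provider_prefix(model: str) -> str:
    """Return the bare model name from a "provider/model" string.

    Strings without a "/" are returned unchanged.
    """
    _provider, separator, name = model.partition("/")
    return name if separator else model


print(strip_provider_prefix("openai/gpt-5.2"))
print(strip_provider_prefix("gpt-5.2"))
```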

Install packages

npm install merge-gateway-ai-sdk-provider ai

Create the provider

TypeScript

import { createMergeGateway } from "merge-gateway-ai-sdk-provider";

const gateway = createMergeGateway({
  apiKey: "YOUR_API_KEY",
});

Send a request

Use generateText to send a request. Model names use the provider/model format.

TypeScript

import { generateText } from "ai";

const { text } = await generateText({
  model: gateway("openai/gpt-4o"),
  prompt: "Explain the concept of recursion in programming with a simple set of examples.",
});

console.log(text);

If you already have @ai-sdk/openai installed, point it at Gateway with a base URL change:

TypeScript

import { createOpenAI } from "@ai-sdk/openai";

const gateway = createOpenAI({
  apiKey: "YOUR_API_KEY",
  baseURL: "https://api-gateway.merge.dev/v1/ai-sdk",
});

// All generateText/streamText calls work unchanged

Point the Anthropic SDK to Gateway

Anthropic SDK

from anthropic import Anthropic

client = Anthropic(
    api_key="YOUR_API_KEY",
    base_url="https://api-gateway.merge.dev/v1/anthropic",
)

message = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Explain the concept of recursion in programming with a simple set of examples."},
    ],
)

print(message.content[0].text)

Explore other models available in Merge Gateway

Amazon Nova 2 Lite

Amazon Nova 2 Sonic

Amazon Nova Premier

Amazon Nova Pro

Claude Opus 4.6

Claude Opus 4.7

Claude Opus 4.8

Claude Sonnet 4.5

Claude Sonnet 4.6

Codestral

Codestral 25.08

DeepSeek V3

DeepSeek V3.2

DeepSeek V4 Flash

DeepSeek V4 Pro

Devstral 2512

Dola Seed 2.0 Code (preview)

Dola Seed 2.0 Lite

Dola Seed 2.0 Mini

Dola Seed 2.0 Pro

Gemini 2.5 Flash

Gemini 2.5 Flash Lite

Gemini 2.5 Pro

Gemini 3.1 Flash Lite

GPT-5 Nano FAQ

If you have other questions about GPT-5 Nano, we've answered a few more below. This information was written in June 2026 and is subject to change.

What other models does OpenAI offer?

OpenAI maintains a tiered lineup spanning cost-optimized, general-purpose, and frontier reasoning models. Here are some other models OpenAI supports:

GPT-4.1 Nano: GPT-4.1 Nano is OpenAI's most cost-efficient general-purpose model at $0.10 input and $0.40 output per million tokens. It delivers fast responses with a 1M token context window, making it a strong alternative for workloads where raw intelligence is less critical than throughput and cost.

GPT-4o Mini: GPT-4o Mini is a fast, cost-efficient multimodal model at $0.15 input and $0.60 output per million tokens. It is well-suited for real-time interfaces and high-volume tasks that need text and image understanding.

GPT-5 Mini: GPT-5 Mini is the next tier up from GPT-5 Nano in the GPT-5 reasoning series, priced at $0.25 input and $2.00 output per million tokens. It scores higher on the Intelligence Index and is the right step up when GPT-5 Nano's capability falls short.

GPT-5: GPT-5 is OpenAI's mid-range GPT-5 series model, offering higher intelligence than the Nano and Mini variants with a 400k token context window.

GPT-5.4: GPT-5.4 is a higher-capability GPT-5 series model positioned above GPT-5, with stronger benchmark scores and a larger maximum output at $2.50 input and $15.00 output per million tokens.

How does GPT-5 Nano differ from OpenAI's other models?

GPT-5 Nano is OpenAI's entry-level reasoning model in the GPT-5 series, built for cost-sensitive workloads that still benefit from chain-of-thought capability.

Pricing: GPT-5 Nano is priced at $0.05 per million input tokens and $0.40 per million output tokens according to Artificial Analysis data. Note that GPT-5 Nano does not appear on OpenAI's official API pricing page; verify current pricing and availability directly with OpenAI before building production workflows on it.

Intelligence ranking: GPT-5 Nano scores 27 on the Artificial Analysis Intelligence Index, placing it #24 out of 216 evaluated models. This is the lowest score in the GPT-5 reasoning series but well above non-reasoning cost-tier alternatives.

Speed: Output speed is 150.7 tokens per second, placing it among the faster models overall. However, time to first token is 106.96 seconds, making it entirely unsuitable for interactive or streaming interfaces.

Context window: GPT-5 Nano supports 400,000 input tokens, matching GPT-5 Mini but smaller than GPT-4.1 Nano's 1M token window.

Knowledge cutoff: GPT-5 Nano has a knowledge cutoff of May 2024, which is older than the August 2025 cutoff on GPT-5.4 Nano and GPT-5.4 Mini.

Capabilities: Accepts text and image inputs and produces text output. It is a reasoning model with chain-of-thought processing.

GPT-5 Nano is best suited for async or batch processing workloads where cost per token is the primary constraint, latency is not a concern, and the task complexity benefits from reasoning capability.
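
To see why the latency figures above point toward batch use, total response time can be approximated as time to first token plus output tokens divided by output speed. The sketch below uses the two figures quoted above and assumes a steady generation rate with no network or queueing overhead; `estimate_response_seconds` is a hypothetical helper name.

```python
TTFT_SECONDS = 106.96             # time to first token, as quoted above
OUTPUT_TOKENS_PER_SECOND = 150.7  # output speed, as quoted above


def estimate_response_seconds(output_tokens: int) -> float:
    """Approximate wall-clock time to receive a full response."""
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return TTFT_SECONDS + output_tokens / OUTPUT_TOKENS_PER_SECOND


for tokens in (100, 500, 2_000):
    print(f"{tokens:>5} tokens: ~{estimate_response_seconds(tokens):.1f} s")
```

Under these assumptions the wait is dominated by time to first token: even a 2,000-token response adds only about 13 seconds of generation time on top of it.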

What models should I consider using alongside GPT-5 Nano?

No single model is optimal for every task. Here are models worth pairing with GPT-5 Nano depending on what your product needs:

GPT-4.1 Nano: For tasks that don't require reasoning traces but need fast responses with a larger context window, GPT-4.1 Nano at $0.10 input offers a 1M token context and near-instant TTFT. Use it when GPT-5 Nano's latency is a blocker.

GPT-5 Mini: When GPT-5 Nano's intelligence score falls short, route higher-complexity tasks to GPT-5 Mini at $0.25 input per million tokens for a meaningful benchmark improvement while staying in the low-cost tier.

GPT-5.4 Nano: For workloads that need stronger capability at a still-affordable price, GPT-5.4 Nano at $0.20 input scores 44 on the Intelligence Index versus GPT-5 Nano's 27, with a much faster TTFT of 6.33 seconds and a more recent knowledge cutoff.

Gemini 2.5 Flash Lite: For latency-sensitive volume tasks that GPT-5 Nano's 107-second TTFT cannot serve, Gemini 2.5 Flash Lite provides very high output speed at a comparable cost tier.

Llama 4 Scout: For batch workloads where open-weight models are acceptable, Llama 4 Scout provides strong intelligence at low cost with a 10M token context window.
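
The pairing guidance above can be sketched as a simple per-task rule. This is an illustration only: the `Task` fields and `pick_model` are hypothetical, and every model ID except the `openai/gpt-5-nano` string mentioned later in this FAQ is a placeholder to confirm against the Gateway docs. In this sketch latency takes priority over complexity.

```python
from dataclasses import dataclass

# Illustrative model strings; confirm exact IDs before use.
GPT_5_NANO = "openai/gpt-5-nano"
GPT_5_MINI = "openai/gpt-5-mini"
GPT_4_1_NANO = "openai/gpt-4.1-nano"


@dataclass(frozen=True)
class Task:
    latency_sensitive: bool  # a user is waiting on the response
    high_complexity: bool    # GPT-5 Nano's quality is known to fall short


def pick_model(task: Task) -> str:
    """Choose a model string following the pairing guidance above."""
    if task.latency_sensitive:
        return GPT_4_1_NANO   # GPT-5 Nano's TTFT is a blocker here
    if task.high_complexity:
        return GPT_5_MINI     # step up within the low-cost tier
    return GPT_5_NANO         # default for async or batch work


print(pick_model(Task(latency_sensitive=False, high_complexity=False)))
```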

What are the challenges of using GPT-5 Nano in my product?

Like any production LLM, GPT-5 Nano comes with tradeoffs worth planning for:

Not on OpenAI's official pricing page: GPT-5 Nano does not appear in OpenAI's published pricing tables. This creates uncertainty around official support, pricing stability, and long-term availability. Do not build critical production workflows on this model without confirming its status with OpenAI directly.

Extremely high latency: A time to first token of 106.96 seconds is one of the highest in the dataset. GPT-5 Nano cannot be used for any user-facing, real-time, or streaming interface. It is only viable for async and batch processing pipelines.

Oldest knowledge cutoff in the series: A May 2024 knowledge cutoff means GPT-5 Nano lacks awareness of events and data from the past two years. For knowledge-dependent tasks, this is a meaningful gap relative to GPT-5.4 Nano's August 2025 cutoff.

Provider dependency: Running on a single OpenAI model creates fragility from outages and rate limits, especially for a model without confirmed official pricing.

Cost at scale: At $0.40 per million output tokens, the 110M tokens generated during evaluation would cost about $44 in output alone (110 x $0.40), so output costs can accumulate faster than the input pricing suggests.

Why should I use Merge Gateway to route LLM requests with GPT-5 Nano and every other model?

Using GPT-5 Nano through Merge Gateway gives you access to the model itself and the infrastructure layer around it:

One API, every provider: Access GPT-5 Nano and every other major LLM through a single endpoint and API key. If OpenAI's pricing or availability status changes, switch to an alternative by updating the model string, with no other code changes.

Intelligent routing and automatic failover: Merge routes around OpenAI outages automatically. Given GPT-5 Nano's uncertain official status, having automatic fallback routing to GPT-5.4 Nano or GPT-4.1 Nano is especially valuable.

Cost governance: Set hard or soft project budgets so GPT-5 Nano spend stays within plan. Every request is attributed to a model, project, and tag in a unified billing dashboard across all providers.

Build Your Own Router: Define benchmark thresholds at which GPT-5 Nano is sufficient versus when to escalate to a higher-capability model. The router picks the winner per request with a plain-language explanation of every decision.

Security and compliance controls: Apply DLP rules and prompt injection protection before every request reaches OpenAI. Enforce per-project model and region policies without adding that logic to your application.
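
To make the difference between soft and hard budgets concrete, here is a minimal client-side sketch. It is not how Merge Gateway implements budgets, which are configured in the dashboard. The class name, the return values, and the assumption that a soft limit only warns while a hard limit blocks are all illustrative.

```python
from dataclasses import dataclass


@dataclass
class ProjectBudget:
    soft_limit_usd: float
    hard_limit_usd: float
    spent_usd: float = 0.0

    def record(self, cost_usd: float) -> str:
        """Add a request's cost and report the resulting budget state.

        Returns "ok", "warn" (soft limit reached), or "block" (the request
        would exceed the hard limit, so it is not recorded).
        """
        if self.spent_usd + cost_usd > self.hard_limit_usd:
            return "block"
        self.spent_usd += cost_usd
        if self.spent_usd >= self.soft_limit_usd:
            return "warn"
        return "ok"


budget = ProjectBudget(soft_limit_usd=1.0, hard_limit_usd=2.0)
for cost in (0.5, 0.6, 1.0):
    print(cost, budget.record(cost))
```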

How can I start using Merge Gateway to route requests with GPT-5 Nano?

Getting GPT-5 Nano running through Merge Gateway takes a few minutes:

1. Create an account and get your API key from the dashboard.

2. Install the Merge Gateway SDK: run pip install merge-gateway-sdk (Python) or npm install merge-gateway-sdk (Node). Alternatively, if you're already using the OpenAI SDK, set base_url = "https://api-gateway.merge.dev/v1/openai" and your existing code works as-is.

3. Make your first request using the provider/model format. For GPT-5 Nano, the model string is openai/gpt-5-nano (confirm the exact model ID with OpenAI's API documentation). Swap the model string to route to any other provider without changing anything else.

4. Configure a routing policy in the dashboard to set failover behavior, cost limits, and optimization strategy. Given GPT-5 Nano's uncertain availability status, configuring GPT-5.4 Nano or GPT-4.1 Nano as an automatic fallback is strongly recommended.
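
The routing policy in step 4 handles failover on the Gateway side. For intuition, the same idea can be sketched client-side as an ordered list of model strings tried in turn. Everything here is an assumption for illustration: `send_with_fallback` and `fake_send` are hypothetical, the fallback model ID is a placeholder, and `send` stands in for whichever SDK call you use.

```python
from typing import Callable, Sequence


def send_with_fallback(
    send: Callable[[str], str],
    models: Sequence[str],
) -> tuple[str, str]:
    """Try each model in order; return (model, response) for the first success.

    `send` takes a model string and returns the response text, raising an
    exception on failure.
    """
    errors: list[str] = []
    for model in models:
        try:
            return model, send(model)
        except Exception as exc:  # broad on purpose: any failure triggers fallback
            errors.append(f"{model}: {exc}")
    raise RuntimeError("all models failed: " + "; ".join(errors))


def fake_send(model: str) -> str:
    """Stand-in for a real SDK call; simulates an outage on the primary model."""
    if model == "openai/gpt-5-nano":
        raise TimeoutError("simulated outage")
    return f"response from {model}"


print(send_with_fallback(fake_send, ["openai/gpt-5-nano", "openai/gpt-4.1-nano"]))
```

In practice, a server-side policy is usually preferable because it applies consistently across every client without duplicating this logic.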

Full setup instructions and SDK references are in the Merge Gateway docs.

Try GPT-5 Nano through Merge Gateway

Route, observe, and control AI requests across providers from one API.

Start building for free

Get a demo