GPT-5.4 Nano:
Everything you need to know about the model

GPT-5.4 Nano is a OpenAI model available through Merge Gateway. Use it with Gateway routing policies, spend controls, request logs, and a 272,000 token context window. It supports streaming, structured outputs, tool calling, vision through at least one Gateway vendor route.

GPT-5.4 Nano pricing

| Vendor | Input / 1M tokens | Output / 1M tokens | Zero data retention | | --- | ---: | ---: | --- | | OpenAI | $0.2000 | $1.25 | Yes |

Test GPT-5.4 Nano with Merge Gateway’s Simulator

GPT-5.4 Nano
Synced
Synced
Run simulation to see response

Ready to try it out?

Start routing requests to hundreds of large language models in your product within minutes.

Route requests to GPT-5.4 Nano with Merge Gateway

Merge Gateway is a unified LLM API that lets your product route requests to GPT-5.4 Nano and every other major model through a single endpoint. You get built-in fallback routing, per-request cost tracking, zero data retention support, and observability without changing your application architecture.
To get started in seconds, add our Gateway Implementation skill to your project, or pick your preferred SDK below. Check out our other quick start skills here.
Install the Merge Gateway SDK
Python
1{
2  "mcpServers": {
3    "agent-handler": {
4      "url": "https://ah-api-develop.merge.dev/api/v1/tool-packs/{TOOL_PACK_ID}/registered-users/{REGISTERED_USER_ID}/mcp",
5      "headers": {
6        "Authorization": "Bearer yMt*****"
7      }
8    }
9  }
10}
11
Make your first API call
Python
1{
2  "mcpServers": {
3    "agent-handler": {
4      "url": "https://ah-api-develop.merge.dev/api/v1/tool-packs/{TOOL_PACK_ID}/registered-users/{REGISTERED_USER_ID}/mcp",
5      "headers": {
6        "Authorization": "Bearer yMt*****"
7      }
8    }
9  }
10}
11
Try a diffrent model
Python
1{
2  "mcpServers": {
3    "agent-handler": {
4      "url": "https://ah-api-develop.merge.dev/api/v1/tool-packs/{TOOL_PACK_ID}/registered-users/{REGISTERED_USER_ID}/mcp",
5      "headers": {
6        "Authorization": "Bearer yMt*****"
7      }
8    }
9  }
10}
11
Install the Merge Gateway SDK
Python
1{
2  "mcpServers": {
3    "agent-handler": {
4      "url": "https://ah-api-develop.merge.dev/api/v1/tool-packs/{TOOL_PACK_ID}/registered-users/{REGISTERED_USER_ID}/mcp",
5      "headers": {
6        "Authorization": "Bearer yMt*****"
7      }
8    }
9  }
10}
11
Make your first API call
Python
1{
2  "mcpServers": {
3    "agent-handler": {
4      "url": "https://ah-api-develop.merge.dev/api/v1/tool-packs/{TOOL_PACK_ID}/registered-users/{REGISTERED_USER_ID}/mcp",
5      "headers": {
6        "Authorization": "Bearer yMt*****"
7      }
8    }
9  }
10}
11
Try a diffrent model
Python
1{
2  "mcpServers": {
3    "agent-handler": {
4      "url": "https://ah-api-develop.merge.dev/api/v1/tool-packs/{TOOL_PACK_ID}/registered-users/{REGISTERED_USER_ID}/mcp",
5      "headers": {
6        "Authorization": "Bearer yMt*****"
7      }
8    }
9  }
10}
11
Install the Merge Gateway SDK
Python
1{
2  "mcpServers": {
3    "agent-handler": {
4      "url": "https://ah-api-develop.merge.dev/api/v1/tool-packs/{TOOL_PACK_ID}/registered-users/{REGISTERED_USER_ID}/mcp",
5      "headers": {
6        "Authorization": "Bearer yMt*****"
7      }
8    }
9  }
10}
11
Make your first API call
Python
1{
2  "mcpServers": {
3    "agent-handler": {
4      "url": "https://ah-api-develop.merge.dev/api/v1/tool-packs/{TOOL_PACK_ID}/registered-users/{REGISTERED_USER_ID}/mcp",
5      "headers": {
6        "Authorization": "Bearer yMt*****"
7      }
8    }
9  }
10}
11
Try a diffrent model
Python
1{
2  "mcpServers": {
3    "agent-handler": {
4      "url": "https://ah-api-develop.merge.dev/api/v1/tool-packs/{TOOL_PACK_ID}/registered-users/{REGISTERED_USER_ID}/mcp",
5      "headers": {
6        "Authorization": "Bearer yMt*****"
7      }
8    }
9  }
10}
11
Install the Merge Gateway SDK
Python
1{
2  "mcpServers": {
3    "agent-handler": {
4      "url": "https://ah-api-develop.merge.dev/api/v1/tool-packs/{TOOL_PACK_ID}/registered-users/{REGISTERED_USER_ID}/mcp",
5      "headers": {
6        "Authorization": "Bearer yMt*****"
7      }
8    }
9  }
10}
11
Make your first API call
Python
1{
2  "mcpServers": {
3    "agent-handler": {
4      "url": "https://ah-api-develop.merge.dev/api/v1/tool-packs/{TOOL_PACK_ID}/registered-users/{REGISTERED_USER_ID}/mcp",
5      "headers": {
6        "Authorization": "Bearer yMt*****"
7      }
8    }
9  }
10}
11
Try a diffrent model
Python
1{
2  "mcpServers": {
3    "agent-handler": {
4      "url": "https://ah-api-develop.merge.dev/api/v1/tool-packs/{TOOL_PACK_ID}/registered-users/{REGISTERED_USER_ID}/mcp",
5      "headers": {
6        "Authorization": "Bearer yMt*****"
7      }
8    }
9  }
10}
11

Explore other models available in Merge Gateway

model logo
GPT-5.2
model logo
GPT-5.4
model logo
Gpt 5.4 Mini
model logo
GPT-5.5
model logo
GPT-5 Mini
model logo
Gpt 5 Nano
model logo
Grok 3
model logo
Grok 3 Mini
model logo
Grok 4 0709
model logo
Grok 4 1 Fast Non Reasoning
model logo
Grok 4 1 Fast Reasoning
model logo
Grok 4.20
model logo
Grok 4.3
model logo
Grok 4 Fast Non Reasoning
model logo
Grok 4 Fast Reasoning
model logo
Grok Code Fast 1
model logo
Jamba 1.5 Large
model logo
Jamba 1.5 Mini
model logo
Kimi K2 0711 Preview
model logo
Kimi K2 0905 Preview
model logo
Kimi K2.5
model logo
Kimi K2.6
model logo
Kimi K2 Thinking
model logo
Kimi K2 Thinking

GPT-5.4 Nano FAQ

In case you have any other questions on GPT-5.4 Nano, we've answered a few more below. It's worth noting that the information below was written in June, 2026 and is subject to change.

Heading

What other models does OpenAI offer?

OpenAI's lineup spans open-weight models, affordable non-reasoning options, mid-tier reasoning systems, and high-capability flagship models for coding and research tasks. Here are some other models OpenAI supports:

  • GPT-4.1 Nano: A non-reasoning model with a 1M-token context window priced at $0.10 per 1M input tokens, optimized for maximum throughput on classification and extraction tasks where reasoning depth is not required
  • GPT-5 mini: A mid-tier reasoning model with extended thinking, a 400k-token context window, and an Intelligence Index score of 41, suited for deliberate multi-step reasoning at a cost below flagship models
  • GPT-5.4 Mini: A step up from GPT-5.4 Nano within the same model family, scoring 49 on the Intelligence Index with a 400k-token context window and output speed of 166.4 tokens per second, at $0.75 per 1M input tokens
  • GPT-5.3: OpenAI's code-focused reasoning model, ranking 12th on the Intelligence Index with a score of 54, designed for complex code generation and agentic software engineering tasks at a premium cost of $1.75 per 1M input tokens
  • GPT-OSS 20B: An Apache 2.0-licensed open-weight reasoning model with a Mixture of Experts architecture, priced at $0.05 per 1M input tokens with a 131k-token context window and 225 tokens per second output speed
  • GPT-OSS 120B: The larger open-weight model at 117B total parameters, scoring 33 on the Intelligence Index with 358.5 tokens per second throughput, suited for latency-sensitive open-infrastructure deployments

How does GPT-5.4 Nano differ from OpenAI's other models?

GPT-5.4 Nano is OpenAI's entry-level reasoning model in the GPT-5.4 family, designed to deliver above-average intelligence at the lowest cost point in the reasoning tier.

  • Intelligence Index (as of 6/2/2026): GPT-5.4 Nano scores 44 out of 162 models, ranking 9th overall. This places it above GPT-4.1 Nano (13) and GPT-5 mini (41) while trailing GPT-5.4 Mini (49) and GPT-5.3 (54)
  • Pricing: At $0.20 per 1M input tokens and $1.25 per 1M output tokens, GPT-5.4 Nano costs twice as much as GPT-4.1 Nano on input but delivers reasoning capability that GPT-4.1 Nano cannot provide. Compared to GPT-5.4 Mini ($0.75 input / $4.50 output), it costs roughly one-quarter on input for only a modest intelligence difference
  • Speed: GPT-5.4 Nano outputs 150.9 tokens per second with a time to first token of 5.54 seconds, which is on the higher end for reasoning models at this tier but still faster than GPT-5.3 (82 seconds TTFT) and GPT-5 mini (74 seconds TTFT)
  • Context window: Like other GPT-5 series models, GPT-5.4 Nano supports a 400k-token context window, matching GPT-5.4 Mini and GPT-5 mini but smaller than GPT-4.1 Nano's 1M-token window
  • Capabilities: GPT-5.4 Nano is a reasoning model supporting text and image inputs with text output. It includes extended thinking and supports a 90% cache discount to $0.02 per 1M tokens for cache hits

GPT-5.4 Nano is the right choice for teams that need reasoning capability without the cost of mid-tier or flagship models, particularly for agentic subtasks, structured extraction with logic requirements, or image-plus-text reasoning at scale.

What models should I consider using alongside GPT-5.4 Nano?

No single model is optimal for every task. Here are models worth pairing with GPT-5.4 Nano depending on what your product needs:

  • GPT-5.4 Mini: When a task requires more reasoning depth than GPT-5.4 Nano reliably delivers, routing to GPT-5.4 Mini gives you a 5-point Intelligence Index lift for roughly three to four times the per-token cost, a worthwhile trade for high-value requests
  • GPT-4.1 Nano: For simple classification, routing, or formatting tasks within a larger pipeline, GPT-4.1 Nano handles them at half the input cost of GPT-5.4 Nano, preserving budget for the requests that actually need reasoning
  • Claude Haiku: For high-frequency instruction-following workloads where structured output formatting is the primary requirement, Claude Haiku provides consistent results at comparable cost with different failure modes than OpenAI's models
  • Gemini 2.0 Flash: For pipelines that process a large number of images or multimodal inputs at high volume, Gemini 2.0 Flash handles vision-heavy tasks efficiently at competitive pricing
  • GPT-OSS 20B: For workloads that can tolerate open-weight inference infrastructure, GPT-OSS 20B delivers reasoning at $0.05 per 1M input tokens with Apache 2.0 licensing, useful for cost-sensitive or self-hosted environments

What are the challenges of using GPT-5.4 Nano in my product?

Like any production LLM, GPT-5.4 Nano comes with tradeoffs worth planning for:

  • Higher latency than non-reasoning alternatives: The 5.54-second time to first token is at the high end for reasoning models in this cost tier and makes GPT-5.4 Nano unsuitable for synchronous user-facing applications that require sub-second response initiation
  • High verbosity: GPT-5.4 Nano generated 210M output tokens during benchmarking evaluation, notably more than comparable models. In production, this tendency toward verbose output increases output token costs and can make response parsing less predictable without explicit length constraints in the system prompt
  • Context window smaller than non-reasoning siblings: At 400k tokens, GPT-5.4 Nano's context window is smaller than GPT-4.1 Nano's 1M-token window. Workloads that require processing very long documents in a single request may need to route to a non-reasoning model instead
  • Provider dependency: A pipeline built exclusively on GPT-5.4 Nano has no resilience if OpenAI experiences an outage or deprecates the model version. Without a configured fallback, disruptions affect all downstream requests simultaneously
  • Cost at scale: At $1.25 per 1M output tokens combined with a high verbosity profile, output costs accumulate faster than they would with less verbose models. Active budget governance and token limits are necessary to keep spend predictable at high volume

Why should I use Merge Gateway to route LLM requests with GPT-5.4 Nano and every other model?

Using GPT-5.4 Nano through Merge Gateway gives you access to the model itself and the infrastructure layer around it:

  • One API, every provider: Access GPT-5.4 Nano and every other major LLM through a single endpoint and API key. Change providers by swapping the model string—no application code changes required
  • Intelligent routing and automatic failover: Merge routes around OpenAI outages automatically. Routing policies based on cost, latency, or quality can reduce spend by 40–60% without touching your application code
  • Cost governance: Set hard or soft project budgets so GPT-5.4 Nano spend stays within plan. Every request is attributed to a model, project, and tag in a unified billing dashboard across all providers
  • Build Your Own Router: Define what "best" means for your traffic by selecting from curated ML benchmarks or adding your own eval scores. The router scores each available model against your weights and picks the winner per request, with a plain-language explanation of every decision
  • Security and compliance controls: Apply DLP rules and prompt injection protection before every request reaches OpenAI. Enforce per-project model and region policies without adding that logic to your application

How can I start using Merge Gateway to route requests with GPT-5.4 Nano?

Getting GPT-5.4 Nano running through Merge Gateway takes a few minutes:

1. Create an account and get your API key from the dashboard.

2. Install the Merge Gateway SDK: run pip install merge-gateway-sdk (Python) or npm install merge-gateway-sdk (Node). Alternatively, if you're already using the OpenAI SDK, set base_url = "https://api-gateway.merge.dev/v1/openai" and your existing code works as-is.

3. Make your first request using the provider/model format. For GPT-5.4 Nano, the model string is openai/gpt-5.4-nano. Swap the model string to route to any other provider without changing anything else.

4. Configure a routing policy in the dashboard to set failover behavior, cost limits, and optimization strategy. Your first policy can be as simple as naming GPT-5.4 Nano as primary with GPT-5.4 Mini as a fallback for higher-complexity requests.

Full setup instructions and SDK references are in the Merge Gateway docs.

Try GPT-5.4 Nano through Merge Gateway

Route, observe, and control AI requests across providers from one API.