GPT-5.4 is a OpenAI model available through Merge Gateway via OpenAI. Use it with Gateway routing policies, spend controls, request logs, and a 1,050,000 token context window. It supports streaming, structured outputs, tool calling, vision through at least one Gateway vendor route.

GPT-5.4 pricing
Test GPT-5.4 with Merge Gateway’s Simulator

Ready to try it out?
Start routing requests to hundreds of large language models in your product within minutes.

Route requests to GPT-5.4 with Merge Gateway
1{
2 "mcpServers": {
3 "agent-handler": {
4 "url": "https://ah-api-develop.merge.dev/api/v1/tool-packs/{TOOL_PACK_ID}/registered-users/{REGISTERED_USER_ID}/mcp",
5 "headers": {
6 "Authorization": "Bearer yMt*****"
7 }
8 }
9 }
10}
111{
2 "mcpServers": {
3 "agent-handler": {
4 "url": "https://ah-api-develop.merge.dev/api/v1/tool-packs/{TOOL_PACK_ID}/registered-users/{REGISTERED_USER_ID}/mcp",
5 "headers": {
6 "Authorization": "Bearer yMt*****"
7 }
8 }
9 }
10}
111{
2 "mcpServers": {
3 "agent-handler": {
4 "url": "https://ah-api-develop.merge.dev/api/v1/tool-packs/{TOOL_PACK_ID}/registered-users/{REGISTERED_USER_ID}/mcp",
5 "headers": {
6 "Authorization": "Bearer yMt*****"
7 }
8 }
9 }
10}
111{
2 "mcpServers": {
3 "agent-handler": {
4 "url": "https://ah-api-develop.merge.dev/api/v1/tool-packs/{TOOL_PACK_ID}/registered-users/{REGISTERED_USER_ID}/mcp",
5 "headers": {
6 "Authorization": "Bearer yMt*****"
7 }
8 }
9 }
10}
111{
2 "mcpServers": {
3 "agent-handler": {
4 "url": "https://ah-api-develop.merge.dev/api/v1/tool-packs/{TOOL_PACK_ID}/registered-users/{REGISTERED_USER_ID}/mcp",
5 "headers": {
6 "Authorization": "Bearer yMt*****"
7 }
8 }
9 }
10}
111{
2 "mcpServers": {
3 "agent-handler": {
4 "url": "https://ah-api-develop.merge.dev/api/v1/tool-packs/{TOOL_PACK_ID}/registered-users/{REGISTERED_USER_ID}/mcp",
5 "headers": {
6 "Authorization": "Bearer yMt*****"
7 }
8 }
9 }
10}
111{
2 "mcpServers": {
3 "agent-handler": {
4 "url": "https://ah-api-develop.merge.dev/api/v1/tool-packs/{TOOL_PACK_ID}/registered-users/{REGISTERED_USER_ID}/mcp",
5 "headers": {
6 "Authorization": "Bearer yMt*****"
7 }
8 }
9 }
10}
111{
2 "mcpServers": {
3 "agent-handler": {
4 "url": "https://ah-api-develop.merge.dev/api/v1/tool-packs/{TOOL_PACK_ID}/registered-users/{REGISTERED_USER_ID}/mcp",
5 "headers": {
6 "Authorization": "Bearer yMt*****"
7 }
8 }
9 }
10}
111{
2 "mcpServers": {
3 "agent-handler": {
4 "url": "https://ah-api-develop.merge.dev/api/v1/tool-packs/{TOOL_PACK_ID}/registered-users/{REGISTERED_USER_ID}/mcp",
5 "headers": {
6 "Authorization": "Bearer yMt*****"
7 }
8 }
9 }
10}
111{
2 "mcpServers": {
3 "agent-handler": {
4 "url": "https://ah-api-develop.merge.dev/api/v1/tool-packs/{TOOL_PACK_ID}/registered-users/{REGISTERED_USER_ID}/mcp",
5 "headers": {
6 "Authorization": "Bearer yMt*****"
7 }
8 }
9 }
10}
111{
2 "mcpServers": {
3 "agent-handler": {
4 "url": "https://ah-api-develop.merge.dev/api/v1/tool-packs/{TOOL_PACK_ID}/registered-users/{REGISTERED_USER_ID}/mcp",
5 "headers": {
6 "Authorization": "Bearer yMt*****"
7 }
8 }
9 }
10}
111{
2 "mcpServers": {
3 "agent-handler": {
4 "url": "https://ah-api-develop.merge.dev/api/v1/tool-packs/{TOOL_PACK_ID}/registered-users/{REGISTERED_USER_ID}/mcp",
5 "headers": {
6 "Authorization": "Bearer yMt*****"
7 }
8 }
9 }
10}
11Explore other models available in Merge Gateway
GPT-5.4 FAQ
Heading
What other models does OpenAI offer?
OpenAI's model lineup spans cost-efficient mini models, general-purpose options, and frontier reasoning models. Here are some other models OpenAI supports:
- GPT-5 Mini: GPT-5 Mini is OpenAI's entry-level reasoning model, priced at $0.25 per million input tokens. It is designed for high-volume applications where cost efficiency is the primary constraint and task complexity is moderate
- GPT-5.1: GPT-5.1 is the first tier in OpenAI's GPT-5 reasoning series, scoring 48 on the Artificial Analysis Intelligence Index. It offers faster generation speed than GPT-5.4 and a lower price point, making it appropriate for reasoning tasks that don't require frontier-level capability
- GPT-5.2: GPT-5.2 is an intermediate reasoning model with a 400k context window and an Intelligence Index score of 51. OpenAI currently recommends GPT-5.4 over GPT-5.2 for new deployments, but GPT-5.2 remains available for teams that have built workflows around it
- GPT-4o: GPT-4o is OpenAI's multimodal general-purpose model, accepting text, image, and audio inputs with low latency. It is a strong choice for real-time or user-facing applications where response speed matters more than deep reasoning
- o3: o3 is OpenAI's dedicated reasoning model, optimized for scientific, mathematical, and code-intensive tasks through compute-heavy inference-time reasoning. It targets benchmark accuracy on problems that require exhaustive logical chains
How does GPT-5.4 differ from OpenAI's other models?
GPT-5.4 is OpenAI's highest-scoring model in the GPT-5 series and currently holds a top-10 position on the Artificial Analysis Intelligence Index.
- Intelligence Index ranking: GPT-5.4 scores 57 on the Artificial Analysis Intelligence Index, ranking #6 out of 150 evaluated models. This is the highest score in the GPT-5 series, 6 points above GPT-5.2 (51) and 9 points above GPT-5.1 (48)
- Context window: GPT-5.4 supports a 1.1 million token context window, the largest in the GPT-5 family. GPT-5.2 is capped at 400k tokens and GPT-5.1 at 272k tokens, making GPT-5.4 the only option for workloads involving very large codebases, lengthy legal documents, or large data sets processed in a single pass
- Pricing: Input is priced at $2.50 per million tokens and output at $15.00 per million tokens. This makes GPT-5.4 the most expensive model in the GPT-5 lineup, roughly 2x the input cost of GPT-5.1 and 1.43x the input cost of GPT-5.2
- Speed: GPT-5.4 generates 79.1 tokens per second, comparable to GPT-5.2 but significantly slower than GPT-5.1 (124.4 t/s). Its time to first token of 190.91 seconds is the highest in the series, reflecting the depth of its reasoning chains
- Capabilities: Like other GPT-5 models, GPT-5.4 accepts text and image inputs and produces text output. Its extended thinking capability is the most developed in the series, contributing to its top-tier benchmark position
GPT-5.4 is the right choice when you need the highest available reasoning accuracy from OpenAI and your workload involves very long contexts, complex multi-step tasks, or async pipelines where a 3+ minute time to first token is acceptable.
What models should I consider using alongside GPT-5.4?
No single model is optimal for every task. Here are models worth pairing with GPT-5.4 depending on what your product needs:
- GPT-5 Mini: GPT-5 Mini at $0.25 per million input tokens is 10x cheaper on input than GPT-5.4. Route routine tasks like classification, entity extraction, or light summarization to GPT-5 Mini, and reserve GPT-5.4 for requests that genuinely require frontier reasoning
- GPT-5.1: For reasoning tasks that fall below GPT-5.4's maximum capability, GPT-5.1 generates output at 124.4 tokens per second, more than 1.5x faster than GPT-5.4, at roughly half the input cost. A smart routing layer can push mid-complexity requests to GPT-5.1 automatically
- Claude Opus 4: Anthropic's flagship model is a strong complement for tasks involving nuanced instruction following, long-form writing, and multi-turn dialogue. Pairing GPT-5.4 with Claude Opus 4 gives your product cross-provider redundancy and a high-quality fallback if OpenAI is unavailable
- Gemini 2.0 Flash: Google's Gemini 2.0 Flash is built for high-throughput, low-latency inference. When your application needs a response in seconds rather than minutes, routing those requests to Gemini 2.0 Flash avoids the 190-second time-to-first-token overhead that GPT-5.4 carries
- Mistral Large: For applications operating under European data residency requirements, Mistral Large provides strong reasoning capability from EU-hosted infrastructure, complementing GPT-5.4 for global deployments with regional compliance constraints.
What are the challenges of using GPT-5.4 in my product?
Like any production LLM, GPT-5.4 comes with tradeoffs worth planning for:
- Extreme time to first token: GPT-5.4's time to first token is 190.91 seconds, the highest in the GPT-5 family and among the highest of any evaluated model. This rules it out for any synchronous or user-facing interaction and requires async infrastructure with appropriate timeout handling
- High cost at scale: At $15.00 per million output tokens, GPT-5.4's output pricing is the highest in the GPT-5 series. Combined with its verbosity (120 million output tokens generated during Intelligence Index evaluation), costs can grow significantly faster than expected at scale
- Verbosity: GPT-5.4 is characterized as "very verbose" by Artificial Analysis, producing lengthy outputs even for tasks that could be answered concisely. If your downstream systems expect short, structured responses, you will need explicit output formatting instructions and potentially post-processing to trim responses
- Provider dependency: OpenAI has already released GPT-5.5 as GPT-5.4's successor. Building tightly around a single model version from a single provider creates migration risk when the model is deprecated and operational risk if OpenAI experiences an outage
- Infrastructure requirements for async pipelines: GPT-5.4's latency profile means your application must queue requests, manage callbacks or polling, and handle timeouts gracefully. Teams expecting drop-in synchronous use will need to invest in pipeline architecture before GPT-5.4 is production-ready for their product
Why should I use Merge Gateway to route LLM requests with GPT-5.4 and every other model?
Using GPT-5.4 through Merge Gateway gives you access to the model itself and the infrastructure layer around it:
- One API, every provider: Access GPT-5.4 and every other major LLM through a single endpoint and API key. Change providers by swapping the model string—no application code changes required
- Intelligent routing and automatic failover: Merge routes around OpenAI outages automatically. Routing policies based on cost, latency, or quality can reduce spend by 40–60% without touching your application code
- Cost governance: Set hard or soft project budgets so GPT-5.4 spend stays within plan. Every request is attributed to a model, project, and tag in a unified billing dashboard across all providers
- Build Your Own Router: Define what "best" means for your traffic by selecting from curated ML benchmarks or adding your own eval scores. The router scores each available model against your weights and picks the winner per request, with a plain-language explanation of every decision
- Security and compliance controls: Apply DLP rules and prompt injection protection before every request reaches OpenAI. Enforce per-project model and region policies without adding that logic to your application
How can I start using Merge Gateway to route requests with GPT-5.4?
Getting GPT-5.4 running through Merge Gateway takes a few minutes:
1. Create an account and get your API key from the dashboard.
2. Install the Merge Gateway SDK: run pip install merge-gateway-sdk (Python) or npm install merge-gateway-sdk (Node). Alternatively, if you're already using the OpenAI SDK, set base_url = "https://api-gateway.merge.dev/v1/openai" and your existing code works as-is.
3. Make your first request using the provider/model format. For GPT-5.4, the model string is openai/gpt-5.4. Swap the model string to route to any other provider without changing anything else.
4. Configure a routing policy in the dashboard to set failover behavior, cost limits, and optimization strategy. Your first policy can be as simple as naming GPT-5.4 as primary with one fallback.
Full setup instructions and SDK references are in the Merge Gateway docs.
Try GPT-5.4 through Merge Gateway
Route, observe, and control AI requests across providers from one API.




