GPT-5.4 Mini:
Everything you need to know about the model

GPT-5.4 Mini is an OpenAI model available through Merge Gateway. Use it with Gateway routing policies, spend controls, request logs, and a 272,000 token context window. It supports streaming, structured outputs, tool calling, and vision through at least one Gateway vendor route.

GPT-5.4 Mini pricing

| Vendor | Input / 1M tokens | Output / 1M tokens | Zero data retention |
| --- | ---: | ---: | --- |
| OpenAI | $0.75 | $4.50 | Yes |
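
The rates in the pricing table above turn into a per-request cost with simple arithmetic. Here is a minimal sketch, assuming the listed OpenAI rates apply uniformly to every token. `estimate_cost_usd` and the token counts are illustrative and are not part of any Merge Gateway SDK.

```python
# Rates from the pricing table above (USD per 1M tokens).
INPUT_RATE_PER_M = 0.75
OUTPUT_RATE_PER_M = 4.50


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request from its token counts."""
    input_cost = input_tokens / 1_000_000 * INPUT_RATE_PER_M
    output_cost = output_tokens / 1_000_000 * OUTPUT_RATE_PER_M
    return input_cost + output_cost


if __name__ == "__main__":
    # Hypothetical request: 2,000 prompt tokens and 800 completion tokens.
    print(f"${estimate_cost_usd(2_000, 800):.4f}")  # prints $0.0051
```

Output tokens cost six times as much as input tokens at these rates, so response length usually dominates the bill.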

Ready to try it out?

Start routing requests to hundreds of large language models in your product within minutes.

Route requests to GPT-5.4 Mini with Merge Gateway

Merge Gateway is a unified LLM API that lets your product route requests to GPT-5.4 Mini and every other major model through a single endpoint. You get built-in fallback routing, per-request cost tracking, zero data retention support, and observability without changing your application architecture.
To get started in seconds, add our Gateway Implementation skill to your project, or pick your preferred SDK below. Check out our other quick start skills here.
Install the Merge Gateway SDK
Python
$ pip install merge-gateway-sdk
Send a request
Python
from merge_gateway import MergeGateway

client = MergeGateway(api_key="YOUR_API_KEY")

# Model strings use the provider/model format.
response = client.responses.create(
    model="openai/gpt-5.4-mini",
    input=[
        {"type": "message", "role": "system", "content": "You are a helpful programming tutor. Explain the concepts clearly with practical examples."},
        {"type": "message", "role": "user", "content": "Explain the concept of recursion in programming with a simple set of examples."},
    ],
)

# Print the text of the first output message.
print(response.output[0].content[0].text)
Try a different model
Swap the model string to route to a different provider. No other code changes needed.
Anthropic
response = client.responses.create(
    model="anthropic/claude-sonnet-4-20250514",
    input=[
        {"type": "message", "role": "system", "content": "You are a helpful programming tutor. Explain the concepts clearly with practical examples."},
        {"type": "message", "role": "user", "content": "Explain the concept of recursion in programming with a simple set of examples."},
    ],
)
Point to Gateway
Python
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api-gateway.merge.dev/v1/openai",
)
Send a request
Use the standard chat.completions.create method. No provider prefix needed on the model name.
Python
# No provider prefix on the model name for this endpoint.
response = client.chat.completions.create(
    model="gpt-5.4-mini",
    messages=[
        {"role": "system", "content": "You are a helpful programming tutor. Explain the concepts clearly with practical examples."},
        {"role": "user", "content": "Explain the concept of recursion in programming with a simple set of examples."},
    ],
)

print(response.choices[0].message.content)
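
The two integration paths name models differently: the Merge Gateway SDK uses the provider/model format, while the OpenAI-compatible endpoint takes the bare model name. If you keep one canonical string in your config, a small helper can derive the other. This is a sketch only: `bare_model_name` is a hypothetical helper, and the check that rejects other providers is an assumption made for illustration, not documented Gateway behavior.

```python
def bare_model_name(model: str, expected_provider: str = "openai") -> str:
    """Strip the provider prefix from a provider/model string.

    Raises ValueError if the string names a different provider. That check
    is an assumption of this sketch, not a documented Gateway rule.
    """
    provider, sep, name = model.partition("/")
    if not sep:
        # No prefix present: the string is already a bare model name.
        return model
    if provider != expected_provider:
        raise ValueError(
            f"expected provider {expected_provider!r}, got {provider!r}"
        )
    return name


if __name__ == "__main__":
    print(bare_model_name("openai/gpt-5.4-mini"))  # prints gpt-5.4-mini
```
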
Install packages
npm install merge-gateway-ai-sdk-provider ai
Create the provider
TypeScript
import { createMergeGateway } from "merge-gateway-ai-sdk-provider";

const gateway = createMergeGateway({
  apiKey: "YOUR_API_KEY",
});
Send a request
Use generateText to send a request. Model names use the provider/model format.
TypeScript
import { generateText } from "ai";

const { text } = await generateText({
  model: gateway("openai/gpt-5.4-mini"),
  prompt: "Explain the concept of recursion in programming with a simple set of examples.",
});

console.log(text);
If you already have @ai-sdk/openai installed, point it at Gateway with a base URL change:
TypeScript
import { createOpenAI } from "@ai-sdk/openai";

const gateway = createOpenAI({
  apiKey: "YOUR_API_KEY",
  baseURL: "https://api-gateway.merge.dev/v1/ai-sdk",
});

// All generateText/streamText calls work unchanged
Point the Anthropic SDK to Gateway
Anthropic SDK
from anthropic import Anthropic

client = Anthropic(
    api_key="YOUR_API_KEY",
    base_url="https://api-gateway.merge.dev/v1/anthropic",
)

message = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Explain the concept of recursion in programming with a simple set of examples."},
    ],
)

print(message.content[0].text)

Explore other models available in Merge Gateway

  • Qwen3 30B A3B Instruct 2507
  • Qwen3-32B
  • Qwen3.5 122B A10B
  • Qwen3.5 27B
  • Qwen3.5-35B-A3B
  • Qwen3.5-397B-A17B
  • Qwen3.5 Flash
  • Qwen3.5 Plus
  • Qwen3.6 35B A3B
  • Qwen3.6 Flash
  • Qwen3.6 Plus
  • Qwen3.7 Max
  • Qwen3 8B
  • Qwen3-Coder-30B-A3B-Instruct
  • Qwen3-Coder-480B
  • Qwen3 Coder Flash
  • Qwen3 Coder Plus
  • Qwen3 Max
  • Qwen3 Next 80B A3B Thinking
  • Qwen 3 Next 80B Instruct
  • Qwen3-VL 235B A22B Thinking
  • Qwen3-VL 30B-A3B Instruct
  • Qwen3-VL 32B Instruct
  • Qwen3-VL-8B-Instruct

GPT-5.4 Mini FAQ

Have more questions about GPT-5.4 Mini? We've answered a few more below. Please note that this information was written in June 2026 and is subject to change.

What other models does OpenAI offer?

OpenAI's lineup spans cost-optimized, general-purpose, and frontier reasoning tiers. Here are some other models OpenAI supports:

  • GPT-5.4 Nano: GPT-5.4 Nano is the cost-optimized sibling in the GPT-5.4 series, priced at $0.20 input and $1.25 output per million tokens. It scores 44 on the Intelligence Index and delivers a faster time to first token (6.33s) than GPT-5.4 Mini, making it the right choice when cost and speed are the primary constraints
  • GPT-4.1 Mini: GPT-4.1 Mini is a non-reasoning model at $0.40 input and $1.60 output per million tokens with a 1M token context window. It is faster to first token and cheaper than GPT-5.4 Mini, suited for tasks where reasoning traces are not required
  • GPT-5: GPT-5 is the standard mid-range model in the GPT-5 series with a 400k token context window, occupying the tier below the GPT-5.4 series
  • GPT-5.4: GPT-5.4 is the flagship in the GPT-5.4 series, scoring 57 on the Intelligence Index at $2.50 input and $15.00 output per million tokens. It delivers significantly higher capability than GPT-5.4 Mini at a higher price
  • GPT-5.5: GPT-5.5 is OpenAI's current highest-capability model at $5.00 input and $30.00 output per million tokens, ranked #2 on the Artificial Analysis Intelligence Index with a score of 60
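
One way to use these figures is to pick the cheapest model that clears a minimum Intelligence Index score. The sketch below uses only the scores and prices quoted on this page. GPT-4.1 Mini and GPT-5 are left out because the page gives no score or no price for them. The 25% output share used for the blended rate is an arbitrary assumption, and `cheapest_meeting` is a hypothetical helper, not a Gateway feature.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelInfo:
    name: str
    score: int          # Artificial Analysis Intelligence Index score
    input_rate: float   # USD per 1M input tokens
    output_rate: float  # USD per 1M output tokens


# Figures as quoted on this page.
MODELS = [
    ModelInfo("GPT-5.4 Nano", 44, 0.20, 1.25),
    ModelInfo("GPT-5.4 Mini", 49, 0.75, 4.50),
    ModelInfo("GPT-5.4", 57, 2.50, 15.00),
    ModelInfo("GPT-5.5", 60, 5.00, 30.00),
]


def blended_rate(model: ModelInfo, output_share: float = 0.25) -> float:
    """USD per 1M tokens for a workload with the given share of output tokens."""
    return (1 - output_share) * model.input_rate + output_share * model.output_rate


def cheapest_meeting(min_score: int, output_share: float = 0.25) -> Optional[ModelInfo]:
    """Return the cheapest model whose score is at least min_score, if any."""
    candidates = [m for m in MODELS if m.score >= min_score]
    if not candidates:
        return None
    return min(candidates, key=lambda m: blended_rate(m, output_share))


if __name__ == "__main__":
    choice = cheapest_meeting(45)
    print(choice.name if choice else "no model qualifies")  # prints GPT-5.4 Mini
```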

How does GPT-5.4 Mini differ from OpenAI's other models?

GPT-5.4 Mini is OpenAI's mid-tier reasoning model in the GPT-5.4 series, combining meaningful intelligence with affordable pricing and official API support.

  • Pricing: GPT-5.4 Mini is priced at $0.75 per million input tokens and $4.50 per million output tokens. This is officially confirmed on OpenAI's pricing page, making it a reliable choice for production budgeting
  • Intelligence ranking: GPT-5.4 Mini scores 49 on the Artificial Analysis Intelligence Index, placing it #27 out of 150 evaluated models. This is higher than GPT-5 Mini at 41 and GPT-5.4 Nano at 44, while staying below GPT-5.4 at 57
  • Speed: Output speed is 164.4 tokens per second, which makes it one of the faster reasoning models. Time to first token is 12.81 seconds, substantially faster than the GPT-5 Mini (99.53s) it replaces for most workloads
  • Context window: GPT-5.4 Mini supports 400,000 input tokens, consistent with the rest of the GPT-5.4 series
  • Knowledge cutoff: GPT-5.4 Mini has a knowledge cutoff of August 2025, more than a year newer than GPT-5 Mini's May 2024 cutoff
  • Capabilities: GPT-5.4 Mini accepts text and image inputs and produces text output. It is a reasoning model released on March 17, 2026

GPT-5.4 Mini is the right choice for teams that need reasoning-capable output with official pricing support, a reasonably fast TTFT, and intelligence meaningfully above the Nano tier.
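
The speed figures above combine into a rough end-to-end latency estimate: time to first token plus generation time at the quoted output speed. This is a back-of-the-envelope sketch that treats both benchmark numbers as constants, which real traffic will not match exactly. `estimate_response_seconds` is an illustrative name.

```python
# Benchmark figures quoted above for GPT-5.4 Mini.
TTFT_SECONDS = 12.81
OUTPUT_TOKENS_PER_SECOND = 164.4


def estimate_response_seconds(output_tokens: int) -> float:
    """Approximate wall-clock time to receive a full response."""
    return TTFT_SECONDS + output_tokens / OUTPUT_TOKENS_PER_SECOND


if __name__ == "__main__":
    for tokens in (100, 1_000, 5_000):
        print(f"{tokens} output tokens: ~{estimate_response_seconds(tokens):.1f}s")
```

For short responses the time to first token dominates. Generation time only overtakes the 12.81 second wait once a response passes roughly 2,100 output tokens.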

What models should I consider using alongside GPT-5.4 Mini?

No single model is optimal for every task. Here are models worth pairing with GPT-5.4 Mini depending on what your product needs:

  • GPT-5.4 Nano: For the majority of requests that don't require GPT-5.4 Mini's intelligence tier, route to GPT-5.4 Nano at $0.20 input per million tokens. The Nano's TTFT is also faster at 6.33 seconds, making it viable for near-real-time batch workloads
  • GPT-5.4: When a task exceeds GPT-5.4 Mini's capability, escalate to GPT-5.4 at $2.50 input per million tokens for an 8-point Intelligence Index improvement (57 versus 49). Use it selectively for ceiling tasks
  • GPT-4.1 Mini: For workloads that don't benefit from reasoning traces and need a larger context window, GPT-4.1 Mini's 1M token context at $0.40 input is a cost-efficient alternative
  • Claude Sonnet 4.6: For cross-provider diversity at a similar intelligence tier, Claude Sonnet 4.6 scores 44 on the Intelligence Index with a 1M token context window and competitive pricing from Anthropic
  • Mistral Small: For European data-residency workloads or cost-sensitive use cases where Mistral's infrastructure is preferred, Mistral Small provides a capable lightweight alternative in a similar price tier

What are the challenges of using GPT-5.4 Mini in my product?

Like any production LLM, GPT-5.4 Mini comes with tradeoffs worth planning for:

  • High time to first token for interactive use: At 12.81 seconds TTFT, GPT-5.4 Mini is slow for synchronous user-facing interfaces. It works well for batch and async pipelines but is a poor fit for real-time chat, because even a streamed response does not begin until the first token arrives
  • Output verbosity: GPT-5.4 Mini generated 240 million output tokens during evaluation, which is among the higher verbosity levels in the dataset. At $4.50 per million output tokens, response length has a direct and significant cost impact
  • Smaller context window than alternatives: The 400,000 token context window is smaller than the 1M context available on GPT-4.1 Mini and GPT-4.1 Nano. Applications with very long documents, large codebases, or extended multi-turn histories may need a different model
  • Cost at scale: At $0.75 input and $4.50 output per million tokens, GPT-5.4 Mini is not a budget model. Combined with its verbosity, cost at production volume can be higher than initial estimates suggest
  • Provider dependency: Running exclusively on OpenAI creates fragility from outages, rate limits, or future model deprecations
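
The provider dependency point is the easiest to picture in code. Merge Gateway handles failover for you, so the client-side sketch below is only meant to illustrate the idea of trying a second provider when the first one fails. `complete_with_fallback` and the `send` callable are hypothetical: `send(model, prompt)` stands in for whichever SDK call you use. Both model strings appear elsewhere on this page.

```python
from typing import Callable, List, Sequence, Tuple


def complete_with_fallback(
    prompt: str,
    models: Sequence[str],
    send: Callable[[str, str], str],
) -> Tuple[str, str]:
    """Try each model in order and return (model_used, response_text)."""
    errors: List[str] = []
    for model in models:
        try:
            return model, send(model, prompt)
        except Exception as exc:  # broad on purpose: this is only a sketch
            errors.append(f"{model}: {exc}")
    raise RuntimeError("all models failed: " + "; ".join(errors))


if __name__ == "__main__":
    def fake_send(model: str, prompt: str) -> str:
        # Simulate an OpenAI outage so the second model is used.
        if model.startswith("openai/"):
            raise TimeoutError("simulated outage")
        return f"response from {model}"

    used, text = complete_with_fallback(
        "Explain recursion.",
        ["openai/gpt-5.4-mini", "anthropic/claude-sonnet-4-20250514"],
        fake_send,
    )
    print(used)  # prints anthropic/claude-sonnet-4-20250514
```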

Why should I use Merge Gateway to route LLM requests with GPT-5.4 Mini and every other model?

Using GPT-5.4 Mini through Merge Gateway gives you access to the model itself and the infrastructure layer around it:

  • One API, every provider: Access GPT-5.4 Mini and every other major LLM through a single endpoint and API key. Change providers by swapping the model string, with no application code changes required
  • Intelligent routing and automatic failover: Merge routes around OpenAI outages automatically. Routing policies based on cost, latency, or quality can reduce spend by routing simpler tasks to GPT-5.4 Nano or GPT-4.1 Mini without code changes
  • Cost governance: Set hard or soft project budgets so GPT-5.4 Mini spend stays within plan. Given its output verbosity, budget caps help prevent cost overruns. Every request is attributed to a model, project, and tag in a unified billing dashboard
  • Build Your Own Router: Score models against your own benchmark weights to determine when GPT-5.4 Mini is worth the cost versus GPT-5.4 Nano. Every routing decision includes a plain-language explanation of why a model was chosen
  • Security and compliance controls: Apply DLP rules and prompt injection protection before every request reaches OpenAI. Enforce per-project model and region policies without adding that logic to your application

How can I start using Merge Gateway to route requests with GPT-5.4 Mini?

Getting GPT-5.4 Mini running through Merge Gateway takes a few minutes:

1. Create an account and get your API key from the dashboard.

2. Install the Merge Gateway SDK: run pip install merge-gateway-sdk (Python) or npm install merge-gateway-sdk (Node). Alternatively, if you're already using the OpenAI SDK, set base_url = "https://api-gateway.merge.dev/v1/openai" and your existing code works as-is.

3. Make your first request using the provider/model format. For GPT-5.4 Mini, the model string is openai/gpt-5.4-mini. Swap the model string to route to any other provider without changing anything else.

4. Configure a routing policy in the dashboard to set failover behavior, cost limits, and optimization strategy. A practical starting point: use GPT-5.4 Nano as the default and escalate to GPT-5.4 Mini for tasks that need a higher intelligence tier, with GPT-5.4 as a further escalation option.
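
The escalation order in step 4 is configured in the dashboard, but the logic can be sketched client-side to show what it does. In the sketch below, only `openai/gpt-5.4-mini` is a model string given on this page. The Nano and GPT-5.4 strings are assumptions that follow the same pattern, and `send` and `is_acceptable` are hypothetical callables you would supply.

```python
from typing import Callable, Tuple

ESCALATION_ORDER = (
    "openai/gpt-5.4-nano",  # assumed model string, not given on this page
    "openai/gpt-5.4-mini",  # model string given in step 3
    "openai/gpt-5.4",       # assumed model string, not given on this page
)


def answer_with_escalation(
    prompt: str,
    send: Callable[[str, str], str],
    is_acceptable: Callable[[str], bool],
) -> Tuple[str, str]:
    """Start with the cheapest tier and escalate until the answer passes.

    If no tier passes the check, the answer from the top tier is returned.
    """
    model, text = "", ""
    for model in ESCALATION_ORDER:
        text = send(model, prompt)
        if is_acceptable(text):
            break
    return model, text
```

A simple `is_acceptable` check could validate that the response parses as the JSON schema you expect, so only malformed answers trigger the more expensive tiers.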

Full setup instructions and SDK references are in the Merge Gateway docs.

Try GPT-5.4 Mini through Merge Gateway

Route, observe, and control AI requests across providers from one API.