GPT-5 Mini:
Everything you need to know about the model

GPT-5 Mini is a OpenAI model available through Merge Gateway. Use it with Gateway routing policies, spend controls, request logs, and a 272,000 token context window. It supports streaming, structured outputs, tool calling, vision through at least one Gateway vendor route.

GPT-5 Mini performance*

Intelligence - general reasoning and knowledge
41%
Coding - code generation and problem-solving
35%
*Performance data is provided by Artificial Analysis and is subject to change.

GPT-5 Mini pricing

| Vendor | Input / 1M tokens | Output / 1M tokens | Zero data retention | | --- | ---: | ---: | --- | | OpenAI | $0.2500 | $2.00 | Yes |

Test GPT-5 Mini with Merge Gateway’s Simulator

GPT-5 Mini
Synced
Synced
Run simulation to see response

Ready to try it out?

Start routing requests to hundreds of large language models in your product within minutes.

Route requests to GPT-5 Mini with Merge Gateway

Merge Gateway is a unified LLM API that lets your product route requests to GPT-5 Mini and every other major model through a single endpoint. You get built-in fallback routing, per-request cost tracking, zero data retention support, and observability without changing your application architecture.
To get started in seconds, add our Gateway Implementation skill to your project, or pick your preferred SDK below. Check out our other quick start skills here.
Install the Merge Gateway SDK
Python
1{
2  "mcpServers": {
3    "agent-handler": {
4      "url": "https://ah-api-develop.merge.dev/api/v1/tool-packs/{TOOL_PACK_ID}/registered-users/{REGISTERED_USER_ID}/mcp",
5      "headers": {
6        "Authorization": "Bearer yMt*****"
7      }
8    }
9  }
10}
11
Make your first API call
Python
1{
2  "mcpServers": {
3    "agent-handler": {
4      "url": "https://ah-api-develop.merge.dev/api/v1/tool-packs/{TOOL_PACK_ID}/registered-users/{REGISTERED_USER_ID}/mcp",
5      "headers": {
6        "Authorization": "Bearer yMt*****"
7      }
8    }
9  }
10}
11
Try a diffrent model
Python
1{
2  "mcpServers": {
3    "agent-handler": {
4      "url": "https://ah-api-develop.merge.dev/api/v1/tool-packs/{TOOL_PACK_ID}/registered-users/{REGISTERED_USER_ID}/mcp",
5      "headers": {
6        "Authorization": "Bearer yMt*****"
7      }
8    }
9  }
10}
11
Install the Merge Gateway SDK
Python
1{
2  "mcpServers": {
3    "agent-handler": {
4      "url": "https://ah-api-develop.merge.dev/api/v1/tool-packs/{TOOL_PACK_ID}/registered-users/{REGISTERED_USER_ID}/mcp",
5      "headers": {
6        "Authorization": "Bearer yMt*****"
7      }
8    }
9  }
10}
11
Make your first API call
Python
1{
2  "mcpServers": {
3    "agent-handler": {
4      "url": "https://ah-api-develop.merge.dev/api/v1/tool-packs/{TOOL_PACK_ID}/registered-users/{REGISTERED_USER_ID}/mcp",
5      "headers": {
6        "Authorization": "Bearer yMt*****"
7      }
8    }
9  }
10}
11
Try a diffrent model
Python
1{
2  "mcpServers": {
3    "agent-handler": {
4      "url": "https://ah-api-develop.merge.dev/api/v1/tool-packs/{TOOL_PACK_ID}/registered-users/{REGISTERED_USER_ID}/mcp",
5      "headers": {
6        "Authorization": "Bearer yMt*****"
7      }
8    }
9  }
10}
11
Install the Merge Gateway SDK
Python
1{
2  "mcpServers": {
3    "agent-handler": {
4      "url": "https://ah-api-develop.merge.dev/api/v1/tool-packs/{TOOL_PACK_ID}/registered-users/{REGISTERED_USER_ID}/mcp",
5      "headers": {
6        "Authorization": "Bearer yMt*****"
7      }
8    }
9  }
10}
11
Make your first API call
Python
1{
2  "mcpServers": {
3    "agent-handler": {
4      "url": "https://ah-api-develop.merge.dev/api/v1/tool-packs/{TOOL_PACK_ID}/registered-users/{REGISTERED_USER_ID}/mcp",
5      "headers": {
6        "Authorization": "Bearer yMt*****"
7      }
8    }
9  }
10}
11
Try a diffrent model
Python
1{
2  "mcpServers": {
3    "agent-handler": {
4      "url": "https://ah-api-develop.merge.dev/api/v1/tool-packs/{TOOL_PACK_ID}/registered-users/{REGISTERED_USER_ID}/mcp",
5      "headers": {
6        "Authorization": "Bearer yMt*****"
7      }
8    }
9  }
10}
11
Install the Merge Gateway SDK
Python
1{
2  "mcpServers": {
3    "agent-handler": {
4      "url": "https://ah-api-develop.merge.dev/api/v1/tool-packs/{TOOL_PACK_ID}/registered-users/{REGISTERED_USER_ID}/mcp",
5      "headers": {
6        "Authorization": "Bearer yMt*****"
7      }
8    }
9  }
10}
11
Make your first API call
Python
1{
2  "mcpServers": {
3    "agent-handler": {
4      "url": "https://ah-api-develop.merge.dev/api/v1/tool-packs/{TOOL_PACK_ID}/registered-users/{REGISTERED_USER_ID}/mcp",
5      "headers": {
6        "Authorization": "Bearer yMt*****"
7      }
8    }
9  }
10}
11
Try a diffrent model
Python
1{
2  "mcpServers": {
3    "agent-handler": {
4      "url": "https://ah-api-develop.merge.dev/api/v1/tool-packs/{TOOL_PACK_ID}/registered-users/{REGISTERED_USER_ID}/mcp",
5      "headers": {
6        "Authorization": "Bearer yMt*****"
7      }
8    }
9  }
10}
11

Explore other models available in Merge Gateway

model logo
GPT-5.2
model logo
GPT-5.4
model logo
Gpt 5.4 Mini
model logo
GPT-5.4 Nano
model logo
GPT-5.5
model logo
Gpt 5 Nano
model logo
Grok 3
model logo
Grok 3 Mini
model logo
Grok 4 0709
model logo
Grok 4 1 Fast Non Reasoning
model logo
Grok 4 1 Fast Reasoning
model logo
Grok 4.20
model logo
Grok 4.3
model logo
Grok 4 Fast Non Reasoning
model logo
Grok 4 Fast Reasoning
model logo
Grok Code Fast 1
model logo
Jamba 1.5 Large
model logo
Jamba 1.5 Mini
model logo
Kimi K2 0711 Preview
model logo
Kimi K2 0905 Preview
model logo
Kimi K2.5
model logo
Kimi K2.6
model logo
Kimi K2 Thinking
model logo
Kimi K2 Thinking

GPT-5 Mini FAQ

For anyone with more questions about GPT-5 Mini, we've covered a few more below. The details here reflect what was known in June, 2026, and are subject to change.

Heading

What other models does OpenAI offer?

OpenAI's model catalog covers a range of price and capability tiers, from cost-efficient mini models to frontier reasoning systems. Here are some other models OpenAI supports:

  • GPT-5.1: GPT-5.1 is the entry-level model in the GPT-5 full reasoning series, scoring 48 on the Artificial Analysis Intelligence Index. Priced at $1.25 per million input tokens, it is designed for tasks requiring stronger reasoning than GPT-5 Mini can provide, while remaining cheaper than higher-tier options
  • GPT-5.2: GPT-5.2 is a mid-tier reasoning model with a 400k-token context window and an Intelligence Index score of 51. It is a step up from GPT-5.1 in both capability and cost, suited for complex multi-step tasks where a larger context window is needed
  • GPT-5.4: GPT-5.4 is OpenAI's current flagship reasoning model, ranking #6 out of 150 models on the Artificial Analysis Intelligence Index. Its 1.1 million token context window and top-tier benchmark performance make it OpenAI's highest-capability option for production reasoning tasks
  • GPT-4o: GPT-4o is OpenAI's general-purpose multimodal model, accepting text, image, and audio inputs with fast response times. It is a strong fit for real-time and conversational applications where speed and breadth of modality matter more than deep reasoning chains
  • o3: o3 is OpenAI's specialized reasoning model, built for scientific, mathematical, and advanced coding problems. It trades throughput for accuracy on difficult benchmarks through compute-intensive inference-time reasoning

How does GPT-5 Mini differ from OpenAI's other models?

GPT-5 Mini is OpenAI's lowest-cost reasoning model, designed for high-volume workloads where price efficiency is the primary constraint.

  • Pricing: GPT-5 Mini is priced at $0.25 per million input tokens and $2.00 per million output tokens. This is 5x cheaper on input than GPT-5.1 ($1.25) and 10x cheaper than GPT-5.4 ($2.50), making it the most accessible reasoning model OpenAI offers
  • Intelligence Index ranking: GPT-5 Mini scores 41 on the Artificial Analysis Intelligence Index, ranking #19 out of 161 comparable models. While this is well above average among models in its pricing tier, it trails GPT-5.1 (48), GPT-5.2 (51), and GPT-5.4 (57) in raw reasoning capability
  • Speed: GPT-5 Mini generates 96.5 tokens per second, roughly in line with the median for its class and faster than both GPT-5.2 (77.5 t/s) and GPT-5.4 (79.1 t/s). However, its time to first token of 72.75 seconds is substantially higher than the class median of 2.05 seconds, reflecting its reasoning overhead
  • Context window: GPT-5 Mini supports a 400k-token context window, matching GPT-5.2 and significantly larger than the 272k available in GPT-5.1. This is an advantage for cost-sensitive applications that also need to process long documents
  • Verbosity: GPT-5 Mini generated 69 million output tokens during the Intelligence Index evaluation, described as "very verbose" relative to the average of 27 million. This means output costs can be higher than the per-token rate alone would suggest

GPT-5 Mini is the right choice when you need extended thinking capability at scale and task complexity does not require the full reasoning depth of GPT-5.1 or above.

What models should I consider using alongside GPT-5 Mini?

No single model is optimal for every task. Here are models worth pairing with GPT-5 Mini depending on what your product needs:

  • GPT-5.1: For requests that exceed GPT-5 Mini's reasoning ceiling (multi-step planning, complex code generation, research synthesis), route to GPT-5.1. Its Intelligence Index score of 48 vs. GPT-5 Mini's 41 reflects meaningfully stronger reasoning capability at a cost that is still moderate relative to GPT-5.4
  • GPT-5.4: When your pipeline encounters truly hard problems requiring frontier-level reasoning, GPT-5.4 (Intelligence Index score 57, ranked #6) is the appropriate escalation target. Use it selectively so costs remain controlled, with GPT-5 Mini handling the volume
  • Gemini 2.0 Flash: Google's Gemini 2.0 Flash is a fast, low-cost model well-suited for high-throughput inference. Pairing it with GPT-5 Mini gives you a cross-provider cost-efficiency layer and a failover option when OpenAI experiences rate limits or outages
  • Claude Haiku 3.5: Anthropic's Claude Haiku 3.5 is optimized for fast, cheap inference on structured tasks like classification, extraction, and short-form generation. It can serve as an alternative or complement to GPT-5 Mini for workloads where reasoning depth is not needed at all
  • Mistral 7B Instruct: For extremely high-volume, low-stakes inference where even GPT-5 Mini's pricing is a concern, Mistral 7B Instruct provides open-weight capability that can be self-hosted or accessed via API at very low cost per token.

What are the challenges of using GPT-5 Mini in my product?

Like any production LLM, GPT-5 Mini comes with tradeoffs worth planning for:

  • Reasoning ceiling: GPT-5 Mini's Intelligence Index score of 41 means it will underperform on tasks that require deep multi-step reasoning, complex code architecture, or rigorous scientific analysis. Routing those requests to GPT-5 Mini without a smarter escalation policy will produce lower-quality outputs than the task warrants
  • High time to first token: GPT-5 Mini's time to first token of 72.75 seconds is far above the class median of 2.05 seconds. This makes it unsuitable for synchronous, user-facing use cases where response delay is visible, even though its generation speed after the first token is competitive
  • Verbosity driving output costs: GPT-5 Mini is "very verbose" relative to the average model in its tier. Because output tokens cost $2.00 per million, lengthy responses can erode the per-task cost savings that make GPT-5 Mini attractive in the first place. Prompt engineering to constrain output length is worth investing in early
  • Outdated knowledge cutoff: GPT-5 Mini's training data cuts off at May 30, 2024, making it one of the older cutoffs in the GPT-5 family. Applications that reference events or documentation from mid-2024 onward will need retrieval augmentation or should route those queries to a model with a more current cutoff
  • Provider dependency: OpenAI has already flagged GPT-5.4 Mini as the successor to GPT-5 Mini. Relying on a single model version without a fallback routing policy creates deprecation risk and leaves your application exposed to any OpenAI service disruption.

Why should I use Merge Gateway to route LLM requests with GPT-5 Mini and every other model?

Using GPT-5 Mini through Merge Gateway gives you access to the model itself and the infrastructure layer around it:

  • One API, every provider: Access GPT-5 Mini and every other major LLM through a single endpoint and API key. Change providers by swapping the model string — no application code changes required
  • Intelligent routing and automatic failover: Merge routes around OpenAI outages automatically. Routing policies based on cost, latency, or quality can reduce spend by 40–60% without touching your application code
  • Cost governance: Set hard or soft project budgets so GPT-5 Mini spend stays within plan. Every request is attributed to a model, project, and tag in a unified billing dashboard across all providers
  • Build Your Own Router: Define what "best" means for your traffic by selecting from curated ML benchmarks or adding your own eval scores. The router scores each available model against your weights and picks the winner per request, with a plain-language explanation of every decision
  • Security and compliance controls: Apply DLP rules and prompt injection protection before every request reaches OpenAI. Enforce per-project model and region policies without adding that logic to your application

How can I start using Merge Gateway to route requests with GPT-5 Mini?

Getting GPT-5 Mini running through Merge Gateway takes a few minutes:

1. Create an account and get your API key from the dashboard.

2. Install the Merge Gateway SDK: run pip install merge-gateway-sdk (Python) or npm install merge-gateway-sdk (Node). Alternatively, if you're already using the OpenAI SDK, set base_url = "https://api-gateway.merge.dev/v1/openai" and your existing code works as-is.

3. Make your first request using the provider/model format. For GPT-5 Mini, the model string is openai/gpt-5-mini. Swap the model string to route to any other provider without changing anything else.

4. Configure a routing policy in the dashboard to set failover behavior, cost limits, and optimization strategy. Your first policy can be as simple as naming GPT-5 Mini as primary with one fallback.

Full setup instructions and SDK references are in the Merge Gateway docs.

Try GPT-5 Mini through Merge Gateway

Route, observe, and control AI requests across providers from one API.