Claude Sonnet 4.5 API: pricing, performance, and how to route requests

Claude Sonnet 4.5: Everything you need to know about the model

Claude Sonnet 4.5 is an Anthropic model available through Merge Gateway. It has a 200,000 token context window and works with Gateway routing policies, spend controls, and request logs. It supports streaming, structured outputs, tool calling, and vision through at least one Gateway vendor route.

Claude Sonnet 4.5 pricing

| Vendor | Input / 1M tokens | Output / 1M tokens | Zero data retention |
| --- | ---: | ---: | --- |
| Amazon Bedrock | $3.00 | $15.00 | Yes |
| Anthropic | $3.00 | $15.00 | No |
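
As a rough illustration of how the per-million-token prices above turn into a per-request cost, here is a minimal sketch. The function name `estimate_cost_usd` and the example token counts are hypothetical; only the two prices come from the table.

```python
# Prices from the table above, in USD per 1M tokens (identical for both vendors).
INPUT_PRICE_PER_M = 3.00
OUTPUT_PRICE_PER_M = 15.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request at the listed Claude Sonnet 4.5 prices."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return (
        input_tokens / 1_000_000 * INPUT_PRICE_PER_M
        + output_tokens / 1_000_000 * OUTPUT_PRICE_PER_M
    )


# Hypothetical request: 10,000 input tokens and 2,000 output tokens.
print(f"${estimate_cost_usd(10_000, 2_000):.4f}")  # prints $0.0600
```

Output tokens cost five times as much as input tokens at these prices, so response length tends to dominate the bill for short prompts.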

Ready to try it out?

Start routing requests to hundreds of large language models in your product within minutes.

Start building for free

Get a demo

Route requests to Claude Sonnet 4.5 with Merge Gateway

Merge Gateway is a unified LLM API that lets your product route requests to Claude Sonnet 4.5 and every other major model through a single endpoint. You get built-in fallback routing, per-request cost tracking, zero data retention support, and observability without changing your application architecture.

To get started in seconds, add our Gateway Implementation skill to your project, or pick your preferred SDK below. Check out our other quick start skills here.

Install the Merge Gateway SDK

Python

$ pip install merge-gateway-sdk

Send a request

Python

from merge_gateway import MergeGateway

client = MergeGateway(api_key="YOUR_API_KEY")

response = client.responses.create(
    model="openai/gpt-5.2",
    input=[
        {"type": "message", "role": "system", "content": "You are a helpful programming tutor. Explain the concepts clearly with practical examples."},
        {"type": "message", "role": "user", "content": "Explain the concept of recursion in programming with a simple set of examples."},
    ],
)

print(response.output[0].content[0].text)

Try a different model

Swap the model string to route to a different provider. No other code changes needed.

Anthropic

response = client.responses.create(
    model="anthropic/claude-sonnet-4-20250514",
    input=[
        {"type": "message", "role": "system", "content": "You are a helpful programming tutor. Explain the concepts clearly with practical examples."},
        {"type": "message", "role": "user", "content": "Explain the concept of recursion in programming with a simple set of examples."},
    ],
)

Point to Gateway

Python

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api-gateway.merge.dev/v1/openai",
)

Send a request

Use the standard chat.completions.create method. No provider prefix needed on the model name.

Python

response = client.chat.completions.create(
    model="gpt-5.2",
    messages=[
        {"role": "system", "content": "You are a helpful programming tutor. Explain the concepts clearly with practical examples."},
        {"role": "user", "content": "Explain the concept of recursion in programming with a simple set of examples."},
    ],
)

print(response.choices[0].message.content)

Install packages

npm install merge-gateway-ai-sdk-provider ai

Create the provider

TypeScript

import { createMergeGateway } from "merge-gateway-ai-sdk-provider";

const gateway = createMergeGateway({
  apiKey: "YOUR_API_KEY",
});

Send a request

Use generateText to send a request. Model names use the provider/model format.

TypeScript

import { generateText } from "ai";

const { text } = await generateText({
  model: gateway("openai/gpt-4o"),
  prompt: "Explain the concept of recursion in programming with a simple set of examples.",
});

console.log(text);

If you already have @ai-sdk/openai installed, point it at Gateway with a base URL change:

TypeScript

import { createOpenAI } from "@ai-sdk/openai";

const gateway = createOpenAI({
  apiKey: "YOUR_API_KEY",
  baseURL: "https://api-gateway.merge.dev/v1/ai-sdk",
});

// All generateText/streamText calls work unchanged

Use the Anthropic SDK

Python

from anthropic import Anthropic

client = Anthropic(
    api_key="YOUR_API_KEY",
    base_url="https://api-gateway.merge.dev/v1/anthropic",
)

message = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Explain the concept of recursion in programming with a simple set of examples."},
    ],
)

print(message.content[0].text)

Explore other models available in Merge Gateway

Qwen3-VL Flash

Qwen3-VL Plus

Qwen Flash

Qwen Plus

Titan Embed Text V2

Titan Text Large

UI-TARS-1.5-7B

Claude Sonnet 4.5 FAQ

If you have additional questions about Claude Sonnet 4.5, we've answered a few more below. This information was written in June 2026 and is subject to change.

What other models does Anthropic offer?

Anthropic's lineup spans three tiers: a cost-efficient high-speed model, a mid-tier general-purpose model, and a flagship reasoning-capable model. Here are some other models Anthropic supports:

Claude Haiku 4.5 is Anthropic's fastest and most affordable model, priced at $1.00 input and $5.00 output per million tokens. It is built for high-throughput, latency-sensitive tasks where speed and cost matter more than maximum intelligence depth

Claude Sonnet 4.6 is the successor to Claude Sonnet 4.5, sitting in the same mid-tier position with improved benchmark performance and an expanded context window of 1 million tokens. Teams on Sonnet 4.5 looking to upgrade will find Sonnet 4.6 to be the most direct path forward at the same price point

Claude Opus 4.6 is a higher-tier Opus model that delivers stronger reasoning and instruction-following than any Sonnet variant. It is priced at $5.00 input and $25.00 output per million tokens, making it appropriate for tasks where output quality justifies the premium

Claude Opus 4.8 is Anthropic's current flagship, also priced at $5.00 input and $25.00 output per million tokens. It leads Anthropic's lineup in reasoning, coding, and long-horizon agentic tasks, and is the recommended choice when maximum intelligence is the priority

How does Claude Sonnet 4.5 differ from Anthropic's other models?

Claude Sonnet 4.5 sits in the mid-tier of Anthropic's lineup, offering a balance of speed and capability between the Haiku and Opus tiers.

Pricing: Claude Sonnet 4.5 is priced at $3.00 per million input tokens and $15.00 per million output tokens. Its input price is three times that of Claude Haiku 4.5 ($1.00) and 40% below that of Claude Opus 4.8 ($5.00)

Context window: Claude Sonnet 4.5 supports a 200,000 token context window, which is the same as Claude Haiku 4.5 but significantly smaller than the 1 million token context window available on Claude Sonnet 4.6 and the Opus models

Knowledge cutoff: Claude Sonnet 4.5 has a reliable knowledge cutoff of January 2025, which is earlier than Claude Sonnet 4.6's August 2025 reliable cutoff. Teams building on current events or recent documentation should account for this difference

Extended thinking: Claude Sonnet 4.5 supports extended thinking, allowing it to reason through complex problems before producing a final answer

Latency: Claude Sonnet 4.5 is classified as a fast model, matching Sonnet 4.6 in comparative latency and sitting between Haiku 4.5 (fastest) and the Opus models (moderate latency)

Modalities: Claude Sonnet 4.5 accepts text and image inputs and produces text output, matching the input modality support of all current Claude 4-series models

Claude Sonnet 4.5 is a good fit for mid-complexity workloads where extended thinking support matters, pricing parity with Sonnet 4.6 is acceptable, and the 200k context window is sufficient for the use case.

What models should I consider using alongside Claude Sonnet 4.5?

No single model is optimal for every task. Here are models worth pairing with Claude Sonnet 4.5 depending on what your product needs:

Claude Haiku 4.5: For classification, extraction, short summaries, or any routine task where intelligence depth isn't critical, routing to Haiku 4.5 at $1.00 per million input tokens can reduce costs substantially while keeping latency low

Claude Sonnet 4.6: If your workloads frequently involve documents or contexts over 200k tokens, Claude Sonnet 4.6's 1 million token context window makes it the natural pairing. Route long-context requests to Sonnet 4.6 and keep Sonnet 4.5 for standard-length completions

Claude Opus 4.8: For the most demanding tasks, such as complex multi-step code generation, deep research synthesis, or high-stakes reasoning, Claude Opus 4.8 provides a ceiling that Sonnet 4.5 doesn't reach. Use Sonnet 4.5 as the default and escalate to Opus only where output quality requires it

GPT-5 Mini: For high-volume OpenAI-architecture workloads running on tight budgets, GPT-5 Mini provides cost-efficient inference as a fallback or secondary path alongside Claude Sonnet 4.5

Gemini 3 Flash: For streaming interfaces or real-time completions where time to first token is the primary constraint, Gemini 3 Flash offers very fast output at a competitive price. Pair it with Sonnet 4.5 for heavier non-latency-critical requests
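
One way to express the Claude pairing advice above in code is a small selection function. This is a minimal sketch, not Merge Gateway's routing logic: the function name `pick_model`, the task labels, and the `needs_top_quality` flag are hypothetical, and it returns display names rather than API model strings. The context limits are the ones quoted in this article.

```python
# Context limits stated in this article, in tokens.
SONNET_4_5_CONTEXT = 200_000
LONG_CONTEXT_LIMIT = 1_000_000  # Sonnet 4.6 and the Opus models

# Hypothetical labels for routine work that can go to the cheaper tier.
ROUTINE_TASKS = {"classification", "extraction", "short_summary"}


def pick_model(task: str, prompt_tokens: int, needs_top_quality: bool = False) -> str:
    """Return a display name for the model to use for one request."""
    if prompt_tokens > LONG_CONTEXT_LIMIT:
        raise ValueError("prompt exceeds every context window listed here")
    if needs_top_quality:
        return "Claude Opus 4.8"  # escalate only when output quality requires it
    if prompt_tokens > SONNET_4_5_CONTEXT:
        return "Claude Sonnet 4.6"  # long-context requests
    if task in ROUTINE_TASKS:
        return "Claude Haiku 4.5"  # cheap, low-latency routine work
    return "Claude Sonnet 4.5"  # default for standard-length completions


print(pick_model("classification", 1_200))
print(pick_model("chat", 350_000))
```

The long-context check runs before the routine-task check because Haiku 4.5 shares the 200,000 token limit, so an oversized classification request still has to go to a 1 million token model.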

What are the challenges of using Claude Sonnet 4.5 in my product?

Like any production LLM, Claude Sonnet 4.5 comes with tradeoffs worth planning for:

Limited context window: The 200,000 token context window is sufficient for most use cases but can become a bottleneck for long-document processing, large code repositories, or retrieval-heavy workflows. Claude Sonnet 4.6 offers 1 million tokens at the same price point if this is a concern

Earlier knowledge cutoff: Claude Sonnet 4.5's reliable knowledge cutoff is January 2025, which means it lacks awareness of events and information after that date. Workloads that depend on current data need to supplement it with retrieval or route to a more recently trained model

Legacy model status: Claude Sonnet 4.5 is classified as a legacy model by Anthropic. While it isn't deprecated, Anthropic's active development focus is on newer generations, and teams should plan a migration path to Sonnet 4.6 when ready

Provider dependency: Running exclusively on Anthropic creates fragility when the provider experiences an outage, rate limits your traffic, or deprecates a model version. This risk compounds when relying on a legacy model that isn't on the active development roadmap

Cost at scale: At $3.00 input and $15.00 output per million tokens, Claude Sonnet 4.5 isn't a budget option. Without project-level spend caps and routing controls, token costs scale quickly as request volume grows
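
To make the idea of a project-level spend cap concrete, here is a minimal client-side sketch. The `ProjectBudget` class, its field names, and the "ok" / "warn" / "block" outcomes are all hypothetical and are not a Merge Gateway API; a gateway would enforce limits server-side.

```python
from dataclasses import dataclass


@dataclass
class ProjectBudget:
    """Hypothetical spend tracker with a warning threshold and a hard cap."""

    soft_limit_usd: float
    hard_limit_usd: float
    spent_usd: float = 0.0

    def check(self, estimated_cost_usd: float) -> str:
        """Classify a pending request by where it would leave total spend."""
        projected = self.spent_usd + estimated_cost_usd
        if projected > self.hard_limit_usd:
            return "block"  # would exceed the hard cap
        if projected > self.soft_limit_usd:
            return "warn"  # allowed, but past the warning threshold
        return "ok"

    def record(self, actual_cost_usd: float) -> None:
        """Add the cost of a completed request to the running total."""
        self.spent_usd += actual_cost_usd


budget = ProjectBudget(soft_limit_usd=80.0, hard_limit_usd=100.0)
budget.record(75.0)
print(budget.check(10.0))  # prints warn
```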

Why should I use Merge Gateway to route LLM requests with Claude Sonnet 4.5 and every other model?

Using Claude Sonnet 4.5 through Merge Gateway gives you access to the model itself and the infrastructure layer around it:

One API, every provider: Access Claude Sonnet 4.5 and every other major LLM through a single endpoint and API key. Change providers by swapping the model string, with no application code changes required

Intelligent routing and automatic failover: Merge routes around Anthropic outages automatically. Routing policies based on cost, latency, or quality can reduce spend by 40-60% without touching your application code. Given Sonnet 4.5's legacy status, having an automatic fallback to Sonnet 4.6 or another model is especially valuable

Cost governance: Set hard or soft project budgets so Claude Sonnet 4.5 spend stays within plan. Every request is attributed to a model, project, and tag in a unified billing dashboard across all providers

Build Your Own Router: Define what "best" means for your traffic by selecting from curated ML benchmarks or adding your own eval scores. The router scores each available model against your weights and picks the winner per request, with a plain-language explanation of every decision

Security and compliance controls: Apply DLP rules and prompt injection protection before every request reaches Anthropic. Enforce per-project model and region policies without adding that logic to your application
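
The score-and-pick behavior described under Build Your Own Router can be pictured as a weighted sum. The sketch below is an assumption about the general technique, not Merge Gateway's actual scoring: the function name `pick_best`, the metric names, the model names, and every score are made-up placeholders rather than benchmark results.

```python
def pick_best(
    models: dict[str, dict[str, float]],
    weights: dict[str, float],
) -> tuple[str, str]:
    """Return (winner, explanation) using a weighted sum of per-model scores."""
    if not models:
        raise ValueError("no candidate models")
    totals = {
        name: sum(weights.get(metric, 0.0) * value for metric, value in scores.items())
        for name, scores in models.items()
    }
    winner = max(totals, key=totals.get)
    explanation = f"{winner} had the highest weighted total ({totals[winner]:.2f})"
    return winner, explanation


# Placeholder scores on a 0-1 scale, for illustration only.
candidates = {
    "model-a": {"quality": 0.9, "speed": 0.5},
    "model-b": {"quality": 0.7, "speed": 0.9},
}

winner, why = pick_best(candidates, {"quality": 0.7, "speed": 0.3})
print(winner, "-", why)
```

Shifting weight from `quality` to `speed` flips the winner in this toy data, which is the point of letting each team define what "best" means for its traffic.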

How can I start using Merge Gateway to route requests with Claude Sonnet 4.5?

Getting Claude Sonnet 4.5 running through Merge Gateway takes a few minutes:

1. Create an account and get your API key from the dashboard.

2. Install the Merge Gateway SDK: run pip install merge-gateway-sdk (Python) or npm install merge-gateway-sdk (Node). Alternatively, if you're already using the OpenAI SDK, set base_url = "https://api-gateway.merge.dev/v1/openai" and your existing code works as-is.

3. Make your first request using the provider/model format. For Claude Sonnet 4.5, the model string is anthropic/claude-sonnet-4-5-20250929. Swap the model string to route to any other provider without changing anything else.

4. Configure a routing policy in the dashboard to set failover behavior, cost limits, and optimization strategy. A practical starting point for Claude Sonnet 4.5: set Sonnet 4.6 as a fallback to handle any requests that exceed the 200k context limit or require a more recent knowledge cutoff.
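
The fallback idea in step 4 is configured in the dashboard, but its behavior can be sketched client-side to show what it does. Everything here is illustrative: `send_with_fallback` and `fake_send` are hypothetical helpers, the fallback model string is a placeholder to replace with the real identifier, and the outage is simulated so the example runs without network access.

```python
from typing import Callable, Optional, Sequence

PRIMARY = "anthropic/claude-sonnet-4-5-20250929"  # model string from step 3
FALLBACK = "anthropic/SONNET_4_6_MODEL_ID"  # placeholder, not a real model string


def send_with_fallback(
    send: Callable[[str, str], str],
    models: Sequence[str],
    prompt: str,
) -> tuple[str, str]:
    """Try each model in order and return (model_used, response_text)."""
    last_error: Optional[Exception] = None
    for model in models:
        try:
            return model, send(model, prompt)
        except Exception as exc:  # a real client would catch narrower error types
            last_error = exc
    raise RuntimeError("all models failed") from last_error


def fake_send(model: str, prompt: str) -> str:
    """Stand-in for a real API call; the primary model always fails here."""
    if model == PRIMARY:
        raise TimeoutError("simulated outage")
    return f"response from {model}"


used, text = send_with_fallback(fake_send, [PRIMARY, FALLBACK], "Hello")
print(used)
```

Passing the send function in as an argument keeps the retry logic independent of any particular SDK, so the same wrapper could sit in front of whichever client you use.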

Full setup instructions and SDK references are in the Merge Gateway docs.

Try Claude Sonnet 4.5 through Merge Gateway

Route, observe, and control AI requests across providers from one API.