Claude Opus 4.8:
Everything you need to know about the model

Claude Opus 4.8 is an Anthropic model available through Merge Gateway. Use it with Gateway routing policies, spend controls, request logs, and a 1,000,000-token context window. It supports streaming, structured outputs, tool calling, and vision through at least one Gateway vendor route.

Claude Opus 4.8 performance*

Intelligence - general reasoning and knowledge
61%
Coding - code generation and problem-solving
57%
*Performance data is provided by Artificial Analysis and is subject to change.

Claude Opus 4.8 pricing

| Vendor | Input / 1M tokens | Output / 1M tokens | Zero data retention |
| --- | ---: | ---: | --- |
| Anthropic | $5.00 | $25.00 | No |
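
To make the per-token rates concrete, here is a minimal sketch of a per-request cost estimate using the Anthropic prices in the table above. The function name `request_cost_usd` and the example token counts are illustrative assumptions, not part of any Merge Gateway SDK, and actual billing may differ.

```python
from decimal import Decimal

# Prices per 1M tokens, taken from the pricing table above.
INPUT_USD_PER_MTOK = Decimal("5.00")
OUTPUT_USD_PER_MTOK = Decimal("25.00")
TOKENS_PER_MTOK = Decimal(1_000_000)


def request_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the cost of one request at the listed Anthropic rates."""
    input_cost = Decimal(input_tokens) / TOKENS_PER_MTOK * INPUT_USD_PER_MTOK
    output_cost = Decimal(output_tokens) / TOKENS_PER_MTOK * OUTPUT_USD_PER_MTOK
    return input_cost + output_cost


# Hypothetical request: 12,000 input tokens and 3,000 output tokens.
print(f"${request_cost_usd(12_000, 3_000):.4f}")
```

With these example counts the input side costs $0.06 and the output side $0.075, so output tokens dominate even though there are four times fewer of them.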

Ready to try it out?

Start routing requests to hundreds of large language models in your product within minutes.

Route requests to Claude Opus 4.8 with Merge Gateway

Merge Gateway is a unified LLM API that lets your product route requests to Claude Opus 4.8 and every other major model through a single endpoint. You get built-in fallback routing, per-request cost tracking, zero data retention support, and observability without changing your application architecture.
To get started in seconds, add our Gateway Implementation skill to your project, or pick your preferred SDK below. Check out our other quick start skills here.
Install the Merge Gateway SDK
Python
$ pip install merge-gateway-sdk
Send a request
Python
from merge_gateway import MergeGateway

client = MergeGateway(api_key="YOUR_API_KEY")

response = client.responses.create(
    model="openai/gpt-5.2",
    input=[
        {"type": "message", "role": "system", "content": "You are a helpful programming tutor. Explain the concepts clearly with practical examples."},
        {"type": "message", "role": "user", "content": "Explain the concept of recursion in programming with a simple set of examples."},
    ],
)

print(response.output[0].content[0].text)
Try a different model
Swap the model string to route to a different provider. No other code changes needed.
Anthropic
response = client.responses.create(
    model="anthropic/claude-sonnet-4-20250514",
    input=[
        {"type": "message", "role": "system", "content": "You are a helpful programming tutor. Explain the concepts clearly with practical examples."},
        {"type": "message", "role": "user", "content": "Explain the concept of recursion in programming with a simple set of examples."},
    ],
)
Point to Gateway
Python
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api-gateway.merge.dev/v1/openai",
)
Send a request
Use the standard chat.completions.create method. No provider prefix needed on the model name.
Python
response = client.chat.completions.create(
    model="gpt-5.2",
    messages=[
        {"role": "system", "content": "You are a helpful programming tutor. Explain the concepts clearly with practical examples."},
        {"role": "user", "content": "Explain the concept of recursion in programming with a simple set of examples."},
    ],
)

print(response.choices[0].message.content)
Install packages
npm install merge-gateway-ai-sdk-provider ai
Create the provider
TypeScript
import { createMergeGateway } from "merge-gateway-ai-sdk-provider";

const gateway = createMergeGateway({
  apiKey: "YOUR_API_KEY",
});
Send a request
Use generateText to send a request. Model names use the provider/model format.
TypeScript
import { generateText } from "ai";

const { text } = await generateText({
  model: gateway("openai/gpt-4o"),
  prompt: "Explain the concept of recursion in programming with a simple set of examples.",
});

console.log(text);
If you already have @ai-sdk/openai installed, point it at Gateway with a base URL change:
TypeScript
import { createOpenAI } from "@ai-sdk/openai";

const gateway = createOpenAI({
  apiKey: "YOUR_API_KEY",
  baseURL: "https://api-gateway.merge.dev/v1/ai-sdk",
});

// All generateText/streamText calls work unchanged
Install the Merge Gateway SDK
Anthropic SDK
from anthropic import Anthropic

client = Anthropic(
    api_key="YOUR_API_KEY",
    base_url="https://api-gateway.merge.dev/v1/anthropic",
)

message = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Explain the concept of recursion in programming with a simple set of examples."},
    ],
)

print(message.content[0].text)

Explore other models available in Merge Gateway

  • Amazon Nova 2 Lite
  • Amazon Nova 2 Sonic
  • Amazon Nova Premier
  • Amazon Nova Pro
  • Claude Opus 4.6
  • Claude Opus 4.7
  • Claude Sonnet 4.6
  • Codestral
  • Codestral 25.08
  • DeepSeek V3
  • DeepSeek V3.2
  • DeepSeek V4 Flash
  • DeepSeek V4 Pro
  • Devstral 2512
  • Dola Seed 2.0 Code (preview)
  • Dola Seed 2.0 Lite
  • Dola Seed 2.0 Mini
  • Dola Seed 2.0 Pro
  • Gemini 2.5 Flash
  • Gemini 2.5 Flash Lite
  • Gemini 2.5 Pro
  • Gemini 3.1 Flash Lite
  • Gemini 3.1 Pro Preview
  • Gemini 3.5 Flash

Claude Opus 4.8 FAQ

Have more questions about Claude Opus 4.8? We've answered a few more below. This information was written in June 2026 and is subject to change.

What other models does Anthropic offer?

Anthropic's lineup spans three distinct tiers designed to balance intelligence, speed, and cost. Here are some other models Anthropic supports:

  • Claude Haiku 4.5: Claude Haiku 4.5 is Anthropic's fastest and most affordable model at $1.00 input and $5.00 output per million tokens. It is built for high-throughput, low-latency workloads where cost efficiency outweighs peak intelligence
  • Claude Sonnet 4.6: Claude Sonnet 4.6 is Anthropic's mid-tier model at $3.00 input and $15.00 output per million tokens. It delivers strong benchmark performance and a 1M token context window at 40% lower input cost than Opus 4.8 ($3.00 versus $5.00), making it the right default for most production workloads
  • Claude Opus 4.6: Claude Opus 4.6 is Anthropic's non-reasoning Opus variant, positioned just below Opus 4.8 on the Intelligence Index. It provides near-flagship intelligence without extended thinking traces and at a faster response speed than Opus 4.8

How does Claude Opus 4.8 differ from Anthropic's other models?

Claude Opus 4.8 is Anthropic's flagship reasoning model and the highest-ranked model on the Artificial Analysis Intelligence Index across all providers.

  • Intelligence ranking: Claude Opus 4.8 scores 61 on the Artificial Analysis Intelligence Index, ranking #1 out of 150+ evaluated models. This is the top score across all providers, ahead of GPT-5.5, which scores 60
  • Pricing: Claude Opus 4.8 is priced at $5.00 per million input tokens and $25.00 per million output tokens. That is 67% higher input cost than Claude Sonnet 4.6 and 5x the input cost of Claude Haiku 4.5
  • Model type: Claude Opus 4.8 is a reasoning model with adaptive extended thinking. It generates chain-of-thought traces before producing a final answer, which contributes to its top benchmark scores but also to higher latency and verbose output
  • Speed and latency: Output speed is 65.6 tokens per second, but time to first token is 10.39 seconds, well above the median of 2.71 seconds across all models. Extended thinking is the primary driver of this latency
  • Output verbosity: Claude Opus 4.8 generated 110 million output tokens during Artificial Analysis evaluation, versus a median of 35 million across all models. At $25.00 per million output tokens, this verbosity has major cost implications at scale
  • Context window: Supports 1,000,000 input tokens, matching Claude Sonnet 4.6 and Opus 4.6

Claude Opus 4.8 is the right choice when task difficulty genuinely requires the top available intelligence, extended reasoning traces are acceptable or desirable, and latency is not a hard constraint.
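
The price and verbosity comparisons above are simple ratios, and a short sketch makes them easy to re-check when prices change. All figures are copied from this FAQ; the helper names are illustrative, and pricing the full evaluation run at the listed output rate is an assumption made only to show scale.

```python
# Figures quoted in the list above (USD per 1M tokens; eval tokens in millions).
OPUS_INPUT_USD = 5.00
SONNET_INPUT_USD = 3.00
HAIKU_INPUT_USD = 1.00
OPUS_OUTPUT_USD = 25.00
OPUS_EVAL_OUTPUT_MTOK = 110
MEDIAN_EVAL_OUTPUT_MTOK = 35


def percent_higher(price: float, baseline: float) -> float:
    """How much higher `price` is than `baseline`, in percent."""
    return (price - baseline) / baseline * 100


def eval_output_cost_usd(output_mtok: float, usd_per_mtok: float = OPUS_OUTPUT_USD) -> float:
    """Output cost of a run, given output tokens in millions (assumed list price)."""
    return output_mtok * usd_per_mtok


print(f"Opus vs Sonnet input: {percent_higher(OPUS_INPUT_USD, SONNET_INPUT_USD):.0f}% higher")
print(f"Opus vs Haiku input: {OPUS_INPUT_USD / HAIKU_INPUT_USD:.0f}x")
print(f"Verbosity vs median: {OPUS_EVAL_OUTPUT_MTOK / MEDIAN_EVAL_OUTPUT_MTOK:.1f}x")
print(f"Eval output cost at Opus rates: ${eval_output_cost_usd(OPUS_EVAL_OUTPUT_MTOK):,.0f}")
```

Run as written, this reports 67% higher, 5x, about 3.1x, and $2,750. The same median token volume at the same rate would come to $875, which is where the cost impact of verbosity shows up.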

What models should I consider using alongside Claude Opus 4.8?

No single model is optimal for every task. Here are models worth pairing with Claude Opus 4.8 depending on what your product needs:

  • Claude Sonnet 4.6: Route the majority of your traffic to Sonnet 4.6 at $3.00 input per million tokens and escalate to Opus 4.8 only for tasks that demonstrably exceed Sonnet's capability. This alone can reduce Anthropic costs by 40% or more on mixed workloads
  • Claude Haiku 4.5: For classification, routing, short extraction, or any task with simple output requirements, Haiku 4.5 at $1.00 input is the right tier. Reserving Opus 4.8 for ceiling tasks keeps spend under control
  • GPT-5.5: For workloads where you want a second opinion from the other top-ranked frontier model, GPT-5.5 scores 60 on the Intelligence Index and provides cross-provider diversity at the capability ceiling
  • Gemini 3 Flash: For latency-sensitive requests where Opus 4.8's 10-second time to first token is unacceptable, Gemini 3 Flash delivers fast responses at lower cost. Use it for streaming interfaces or real-time interactions while Opus 4.8 handles batch and async workloads
  • Kimi K2.6: For open-weight workloads where you want near-frontier intelligence without proprietary provider lock-in, Kimi K2.6 ranks at the top of open models and provides a strong cost-efficient alternative for batch processing
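
As a rough way to size the Sonnet-first pairing described above, the sketch below computes the blended input price when a share of tokens goes to Sonnet 4.6 and the rest to Opus 4.8. It uses only the list prices quoted in this FAQ; the function names and the example shares are assumptions for illustration.

```python
# Input prices per 1M tokens, as quoted in this FAQ.
SONNET_INPUT_USD = 3.00
OPUS_INPUT_USD = 5.00


def blended_input_price(sonnet_share: float) -> float:
    """Average input price per 1M tokens when `sonnet_share` of tokens go to Sonnet 4.6."""
    if not 0.0 <= sonnet_share <= 1.0:
        raise ValueError("sonnet_share must be between 0 and 1")
    return sonnet_share * SONNET_INPUT_USD + (1.0 - sonnet_share) * OPUS_INPUT_USD


def savings_vs_all_opus(sonnet_share: float) -> float:
    """Fractional saving relative to sending every token to Opus 4.8."""
    return 1.0 - blended_input_price(sonnet_share) / OPUS_INPUT_USD


for share in (0.5, 0.8, 1.0):
    print(f"{share:.0%} to Sonnet: {savings_vs_all_opus(share):.0%} saved on input price")
```

On list price alone the saving tops out at 40%, reached when all traffic runs on Sonnet 4.6. Any saving beyond that would have to come from differences in output token counts, which this sketch does not model.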

What are the challenges of using Claude Opus 4.8 in my product?

Like any production LLM, Claude Opus 4.8 comes with tradeoffs worth planning for:

  • Extreme output verbosity: Claude Opus 4.8 generates roughly 3x the output tokens of the median model due to its extended thinking traces. At $25.00 per million output tokens, this verbosity is the single largest cost risk in production and requires active monitoring and budget controls
  • High latency: A time to first token of 10.39 seconds makes Opus 4.8 unsuitable for any interactive or streaming interface. This is a hard architectural constraint, not a tunable parameter
  • Output pricing at scale: At $5.00 input and $25.00 output per million tokens, Opus 4.8 is one of the most expensive models available. Running significant traffic through it without routing controls will produce very high bills quickly
  • Provider dependency: Running exclusively on Anthropic means outages, rate limits, or model deprecations directly impact your product. Anthropic has iterated through the Claude 4 series quickly, and availability is not guaranteed for any specific version
  • Cost at scale: The combination of high pricing and output verbosity means that Opus 4.8 should be reserved for tasks where it is genuinely necessary. Routing all traffic here is rarely justifiable on cost grounds
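
One way to contain the verbosity and pricing risks above is to check the worst-case cost of a request before sending it. The sketch below is a hypothetical client-side helper, not a Merge Gateway feature: `BudgetGuard`, `cost_usd`, and the example limit are all assumed names and values, and it relies on the request carrying an output cap such as `max_tokens`.

```python
from dataclasses import dataclass

INPUT_USD_PER_MTOK = 5.00    # from the pricing quoted above
OUTPUT_USD_PER_MTOK = 25.00


def cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Cost of a request at the listed per-1M-token rates."""
    return (input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK) / 1_000_000


@dataclass
class BudgetGuard:
    """Hypothetical client-side spend tracker with a hard limit."""
    limit_usd: float
    spent_usd: float = 0.0

    def can_afford(self, input_tokens: int, max_output_tokens: int) -> bool:
        """Check the worst case, where the model uses its full output cap."""
        return self.spent_usd + cost_usd(input_tokens, max_output_tokens) <= self.limit_usd

    def record(self, input_tokens: int, output_tokens: int) -> float:
        """Add the actual usage of a completed request to the running total."""
        cost = cost_usd(input_tokens, output_tokens)
        self.spent_usd += cost
        return cost


guard = BudgetGuard(limit_usd=1.00)
if guard.can_afford(input_tokens=10_000, max_output_tokens=4_000):
    # ... send the request here, then record the token usage it reports ...
    guard.record(input_tokens=10_000, output_tokens=2_500)
print(f"spent so far: ${guard.spent_usd:.4f}")
```

Checking against the output cap rather than an expected length is deliberately conservative: with a verbose model, the cap is the only number known before the response arrives.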

Why should I use Merge Gateway to route LLM requests with Claude Opus 4.8 and every other model?

Using Claude Opus 4.8 through Merge Gateway gives you access to the model itself and the infrastructure layer around it:

  • One API, every provider: Access Claude Opus 4.8 and every other major LLM through a single endpoint and API key. Change providers by swapping the model string, with no application code changes required
  • Intelligent routing and automatic failover: Merge routes around Anthropic outages automatically. Given Opus 4.8's pricing and verbosity, routing even a fraction of requests to a cheaper model through ML-driven policies can reduce spend by 40–60%
  • Cost governance: Set hard or soft project budgets so Claude Opus 4.8 spend stays within plan. Given its output verbosity, hard limits are especially important. Every request is attributed to a model, project, and tag in a unified billing dashboard
  • Build Your Own Router: Define the intelligence threshold at which Opus 4.8 is worth the cost by scoring models against your own benchmark weights or eval scores. The router picks the winner per request with a plain-language explanation of every decision
  • Security and compliance controls: Apply DLP rules and prompt injection protection before every request reaches Anthropic. Enforce per-project model and region policies without adding that logic to your application

How can I start using Merge Gateway to route requests with Claude Opus 4.8?

Getting Claude Opus 4.8 running through Merge Gateway takes a few minutes:

1. Create an account and get your API key from the dashboard.

2. Install the Merge Gateway SDK: run pip install merge-gateway-sdk (Python) or npm install merge-gateway-sdk (Node). Alternatively, if you're already using the OpenAI SDK, set base_url = "https://api-gateway.merge.dev/v1/openai" and your existing code works as-is.

3. Make your first request using the provider/model format. For Claude Opus 4.8, the model string is anthropic/claude-opus-4-8 (confirm the exact dated slug with Anthropic's API documentation). Swap the model string to route to any other provider without changing anything else.

4. Configure a routing policy in the dashboard to set failover behavior, cost limits, and optimization strategy. A recommended starting point: set Claude Sonnet 4.6 as the default tier with Opus 4.8 as an escalation target for requests that exceed a defined complexity or quality threshold.
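
The escalation idea in step 4 is configured in the dashboard, but its logic can be sketched in a few lines. This is an illustration only: the `complexity` score is assumed to come from your own classifier or eval, the 0.7 threshold is arbitrary, and the Sonnet slug is a hypothetical placeholder (the Opus slug is the one given in step 3 and should be confirmed).

```python
DEFAULT_MODEL = "anthropic/claude-sonnet-4-6"    # hypothetical slug, confirm before use
ESCALATION_MODEL = "anthropic/claude-opus-4-8"   # slug from step 3, confirm the dated form


def choose_model(complexity: float, threshold: float = 0.7) -> str:
    """Return the escalation model only when the score reaches the threshold."""
    if not 0.0 <= complexity <= 1.0:
        raise ValueError("complexity must be between 0 and 1")
    return ESCALATION_MODEL if complexity >= threshold else DEFAULT_MODEL


for score in (0.2, 0.7, 0.95):
    print(score, "->", choose_model(score))
```

The returned string can be passed as the `model` argument in any of the SDK examples on this page.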

Full setup instructions and SDK references are in the Merge Gateway docs.

Try Claude Opus 4.8 through Merge Gateway

Route, observe, and control AI requests across providers from one API.