Jamba 1.5 Large: pricing, performance, and how to route requests

Route requests to Jamba 1.5 Large with Merge Gateway

Apply your own routing policies, reduce token costs automatically, and see every routing decision in real time with Merge Gateway.

How Jamba 1.5 Large performs*

Intelligence - general reasoning and knowledge

*Performance data is provided by Artificial Analysis and is subject to change.

What Jamba 1.5 Large costs to run

| Vendor | Input / 1M tokens | Output / 1M tokens | Zero data retention |
| --- | ---: | ---: | --- |
| Amazon Bedrock | $2.00 | $8.00 | Yes |

Test Jamba 1.5 Large with Gateway’s Simulator

See a prompt's output, token spend, latency, and more with Jamba 1.5 Large.

Route requests to Jamba 1.5 Large in minutes

To get started in seconds, add our Gateway Implementation skill to your project, or pick your preferred SDK below. Check out our other quick start skills here.

Install the Merge Gateway SDK

Python

$ pip install merge-gateway-sdk

Send a request

Python

from merge_gateway import MergeGateway

client = MergeGateway(api_key="YOUR_API_KEY")

response = client.responses.create(
    model="openai/gpt-5.2",
    input=[
        {"type": "message", "role": "system", "content": "You are a helpful programming tutor. Explain the concepts clearly with practical examples."},
        {"type": "message", "role": "user", "content": "Explain the concept of recursion in programming with a simple set of examples."},
    ],
)

print(response.output[0].content[0].text)

Try a different model

Swap the model string to route to a different provider. No other code changes needed.

Anthropic

response = client.responses.create(
    model="anthropic/claude-sonnet-4-20250514",
    input=[
        {"type": "message", "role": "system", "content": "You are a helpful programming tutor. Explain the concepts clearly with practical examples."},
        {"type": "message", "role": "user", "content": "Explain the concept of recursion in programming with a simple set of examples."},
    ],
)

Point to Gateway

Python

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api-gateway.merge.dev/v1/openai",
)

Send a request

Use the standard chat.completions.create method. No provider prefix needed on the model name.

Python

response = client.chat.completions.create(
    model="gpt-5.2",
    messages=[
        {"role": "system", "content": "You are a helpful programming tutor. Explain the concepts clearly with practical examples."},
        {"role": "user", "content": "Explain the concept of recursion in programming with a simple set of examples."},
    ],
)

print(response.choices[0].message.content)
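The snippets on this page use two naming conventions: `provider/model` for the Merge Gateway SDK and a bare model name for the OpenAI-compatible endpoint. If one codebase targets both, a small helper can convert between them. This is only a sketch; `split_model_string` is a hypothetical name and not part of any SDK.

```python
from typing import Optional, Tuple


def split_model_string(model: str) -> Tuple[Optional[str], str]:
    """Split 'provider/model' into (provider, model).

    A bare model name (no slash) yields (None, model).
    """
    provider, sep, name = model.partition("/")
    if sep:
        return provider, name
    return None, model


# The prefixed form used with the Merge Gateway SDK example on this page.
provider, bare_name = split_model_string("openai/gpt-5.2")
print(provider, bare_name)
```

The bare name is what the `chat.completions.create` example above passes as `model`.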

Install packages

npm install merge-gateway-ai-sdk-provider ai

Create the provider

TypeScript

import { createMergeGateway } from "merge-gateway-ai-sdk-provider";

const gateway = createMergeGateway({
  apiKey: "YOUR_API_KEY",
});

Send a request

Use generateText to send a request. Model names use the provider/model format.

TypeScript

import { generateText } from "ai";

const { text } = await generateText({
  model: gateway("openai/gpt-4o"),
  prompt: "Explain the concept of recursion in programming with a simple set of examples.",
});

console.log(text);

If you already have @ai-sdk/openai installed, point it at Gateway with a base URL change:

TypeScript

import { createOpenAI } from "@ai-sdk/openai";

const gateway = createOpenAI({
  apiKey: "YOUR_API_KEY",
  baseURL: "https://api-gateway.merge.dev/v1/ai-sdk",
});

// All generateText/streamText calls work unchanged

Install the Merge Gateway SDK

Anthropic SDK

from anthropic import Anthropic

client = Anthropic(
    api_key="YOUR_API_KEY",
    base_url="https://api-gateway.merge.dev/v1/anthropic",
)

message = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Explain the concept of recursion in programming with a simple set of examples."},
    ],
)

print(message.content[0].text)

Explore other models available in Merge Gateway

Amazon Nova 2 Lite

Amazon Nova 2 Sonic

Amazon Nova Lite

Amazon Nova Micro

Amazon Nova Premier

Amazon Nova Pro

Claude 3.7 Sonnet

Claude Haiku 4.5 (20251001)

Claude Opus 4.6

Claude Opus 4.7

Claude Opus 4.8

Claude Sonnet 4.5

Claude Sonnet 4.6

Claude Sonnet 5

Codestral

Codestral 25.08

Command R 08-2024

Command R+ 08-2024

Command R7B 12-2024

DeepSeek R1

DeepSeek V3

DeepSeek V3.2

DeepSeek V4 Flash

DeepSeek V4 Pro

Jamba 1.5 Large FAQ

If you have more questions about Jamba 1.5 Large, we've answered a few below. The details here reflect what was known in June 2026 and are subject to change.


What other models does AI21 Labs offer?

AI21 Labs builds the Jamba family of models, all based on a hybrid Mamba-Transformer Mixture of Experts architecture. Here are some other models AI21 Labs supports:

Jamba 1.5 Mini: It's the smaller sibling to Jamba 1.5 Large, with 52 billion total parameters and 12 billion active. It's priced at $0.20 per 1M input tokens and $0.40 per 1M output tokens, making it a cost-efficient option for high-volume, lower-complexity tasks that still benefit from the 256k context window

Jamba 1.6 Large: Released in March 2025, this is the successor to Jamba 1.5 Large and shares the same 398 billion total / 94 billion active parameter count. It offers improved performance at the same $2.00 per 1M input / $8.00 per 1M output pricing, along with measurably faster output speeds of around 58.6 tokens per second

Jamba 1.6 Mini: The updated version of Jamba 1.5 Mini, released March 2025. It runs at 180.7 tokens per second, making it one of the fastest models available, while keeping the same low pricing tier as its predecessor

Jamba2 Mini: A more recent release positioned for core enterprise workflows. It emphasizes reliability and steerability, and is available for download on Hugging Face for private deployment scenarios

Jamba2 3B: A compact model designed for on-device applications and agentic workflows. It's built for environments where model size and deployment footprint matter more than raw capability

How does Jamba 1.5 Large differ from AI21 Labs's other models?

Jamba 1.5 Large is the highest-capability model in the 1.5 generation, designed for tasks that require more reasoning depth and long-context handling than Jamba 1.5 Mini can deliver.

Context window: Both Jamba 1.5 Large and Jamba 1.5 Mini support a 256k token context window, so context length isn't a differentiator within this generation

Parameters: Jamba 1.5 Large has 398 billion total parameters with 94 billion active during inference, compared to Jamba 1.5 Mini's 52 billion total and 12 billion active. This larger active parameter count gives it more capacity for complex tasks

Pricing: Jamba 1.5 Large costs $2.00 per 1M input tokens and $8.00 per 1M output tokens. Jamba 1.5 Mini costs $0.20 per 1M input and $0.40 per 1M output, meaning the Mini is approximately 10x cheaper on input and 20x cheaper on output

Generation position: Jamba 1.5 Large has since been succeeded by Jamba 1.6 Large, which offers similar pricing but faster throughput and improved benchmark performance

Jamba 1.5 Large is best suited for teams already integrated with the AI21 Labs API that need more headroom than Mini offers, particularly for long-document processing where its full parameter count can be put to use.
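To make the pricing gap concrete, here is a minimal sketch that computes per-request cost from the per-1M-token prices quoted above. The workload (a 100k-token document summarized into 1k tokens) and the names `Pricing` and `request_cost` are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """USD price per 1M tokens, as listed on this page (subject to change)."""
    input_per_million: float
    output_per_million: float


JAMBA_1_5_LARGE = Pricing(input_per_million=2.00, output_per_million=8.00)
JAMBA_1_5_MINI = Pricing(input_per_million=0.20, output_per_million=0.40)


def request_cost(pricing: Pricing, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request at the given token counts."""
    return (
        input_tokens * pricing.input_per_million
        + output_tokens * pricing.output_per_million
    ) / 1_000_000


# Hypothetical workload: a 100k-token document summarized into 1k tokens.
large = request_cost(JAMBA_1_5_LARGE, input_tokens=100_000, output_tokens=1_000)
mini = request_cost(JAMBA_1_5_MINI, input_tokens=100_000, output_tokens=1_000)

print(f"Large: ${large:.4f}  Mini: ${mini:.4f}  ratio: {large / mini:.1f}x")
```

For input-heavy workloads like this one, the overall ratio stays close to the 10x input-price gap. Output-heavy workloads move toward the 20x output-price gap.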

What models should I consider using alongside Jamba 1.5 Large?

No single model is optimal for every task. Here are models worth pairing with Jamba 1.5 Large depending on what your product needs:

Claude Sonnet 4.5: When your workload involves complex multi-step reasoning or nuanced instruction following, Claude Sonnet 4.5 delivers higher benchmark performance. It's a strong fallback for tasks where Jamba 1.5 Large's intelligence ranking leaves gaps

Gemini 2.0 Flash: For high-throughput production tasks where speed and cost matter more than top-tier reasoning, Gemini 2.0 Flash offers a favorable cost-per-token profile with fast output speeds. It's a practical routing target for simple classification or summarization requests

GPT-4o Mini: When you need reliable instruction following at a low price point and don't require a 256k context window, GPT-4o Mini is a cost-efficient alternative. It works well alongside Jamba 1.5 Large as a cheaper fallback for shorter, structured prompts

Llama 3.3 70B: If you're running open-weight models and want a more capable option than Jamba 1.5 Large for reasoning tasks, Llama 3.3 70B offers stronger benchmark scores in that parameter tier. It's a useful pairing for teams self-hosting part of their stack

Jamba 1.5 Mini: For the same long-context use cases where you can tolerate lower capability in exchange for 10-20x cost savings, routing simpler requests to Jamba 1.5 Mini keeps infrastructure consistent while reducing spend significantly
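One simple way to pair models is to route by prompt size: send long-document requests to Jamba 1.5 Large and short ones to a cheaper model. The sketch below makes several assumptions. The 4-characters-per-token estimate is a rough rule of thumb, the 8,000-token threshold is arbitrary, and `openai/gpt-4o-mini` is a guessed model string that you should check against the Gateway model list.

```python
JAMBA_CONTEXT_TOKENS = 256_000  # context window stated on this page

LONG_CONTEXT_MODEL = "ai21/jamba-1.5-large"
CHEAP_MODEL = "openai/gpt-4o-mini"  # assumed model string, verify before use


def estimate_tokens(text: str) -> int:
    """Very rough estimate: about 4 characters per token for English text."""
    return max(1, len(text) // 4)


def choose_model(prompt: str, short_prompt_limit: int = 8_000) -> str:
    """Pick a model string based on the estimated prompt size."""
    tokens = estimate_tokens(prompt)
    if tokens > JAMBA_CONTEXT_TOKENS:
        raise ValueError(f"prompt too long: ~{tokens} tokens")
    if tokens <= short_prompt_limit:
        return CHEAP_MODEL
    return LONG_CONTEXT_MODEL


print(choose_model("Classify this ticket as billing or technical."))
print(choose_model("x" * 100_000))  # ~25k estimated tokens
```

In production you would likely use a real tokenizer and tune the threshold against your own traffic.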

What are the challenges of using Jamba 1.5 Large in my product?

Like any production LLM, Jamba 1.5 Large comes with tradeoffs worth planning for:

Intelligence ranking: Artificial Analysis ranks Jamba 1.5 Large 39th out of 43 comparable models on its Intelligence Index (as of 06/04/2026). For tasks requiring strong reasoning or coding ability, this is a meaningful constraint that may push you toward a different primary model

Cost relative to capability: At $2.00 per 1M input and $8.00 per 1M output tokens, the pricing is on the higher end for a model at this intelligence tier. You may find models ranked significantly higher available at comparable or lower price points

Knowledge cutoff: Jamba 1.5 Large has a training data cutoff of March 5, 2024. Any product relying on it for current events, recent documentation, or up-to-date facts will need retrieval augmentation to compensate

Provider dependency: Routing all requests through AI21 Labs creates a single point of failure. If AI21 Labs experiences an outage or deprecates this model version, there's no automatic fallback unless you've built one into your routing layer

Cost at scale: Output tokens are billed at $8.00 per million, which compounds quickly at high request volumes. Without active budget controls, a spike in traffic or response length can generate unexpected spend
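The provider-dependency risk above comes down to having a fallback. Here is a minimal sketch of that idea as plain application code. The `send` callable and the `fake_send` stub are hypothetical stand-ins for a real client call, and the stub simulates an outage on the primary model.

```python
from typing import Callable, List, Sequence, Tuple


def complete_with_fallback(
    send: Callable[[str, str], str],
    models: Sequence[str],
    prompt: str,
) -> Tuple[str, str]:
    """Try each model in order; return (model_used, text) from the first success."""
    errors: List[str] = []
    for model in models:
        try:
            return model, send(model, prompt)
        except Exception as exc:  # real code should catch the SDK's specific errors
            errors.append(f"{model}: {exc}")
    raise RuntimeError("all models failed: " + "; ".join(errors))


def fake_send(model: str, prompt: str) -> str:
    """Stand-in for a real client call; fails for the primary to simulate an outage."""
    if model.startswith("ai21/"):
        raise ConnectionError("simulated outage")
    return f"[{model}] answer to: {prompt}"


used, text = complete_with_fallback(
    fake_send,
    ["ai21/jamba-1.5-large", "anthropic/claude-sonnet-4-20250514"],
    "Summarize this contract.",
)
print(used)
```

A routing layer can apply the same ordered-fallback logic outside your application, which keeps retry code out of the request path.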

Why should I use Merge Gateway to route LLM requests with Jamba 1.5 Large and every other model?

Using Jamba 1.5 Large through Merge Gateway gives you access to the model itself and the infrastructure layer around it:

One API, every provider: Access Jamba 1.5 Large and every other major LLM through a single endpoint and API key. Change providers by swapping the model string — no application code changes required

Intelligent routing and automatic failover: Merge routes around AI21 Labs outages automatically. Routing policies based on cost, latency, or quality can reduce spend by 40-60% without touching your application code

Cost governance: Set hard or soft project budgets so Jamba 1.5 Large spend stays within plan. Every request is attributed to a model, project, and tag in a unified billing dashboard across all providers

Build Your Own Router: Define what "best" means for your traffic by selecting from curated ML benchmarks or adding your own eval scores. The router scores each available model against your weights and picks the winner per request, with a plain-language explanation of every decision

Security and compliance controls: Apply DLP rules and prompt injection protection before every request reaches AI21 Labs. Enforce per-project model and region policies without adding that logic to your application
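The "score each model against your weights" idea can be illustrated with a weighted sum. This is a generic sketch, not Merge's actual scoring method, which this page does not document. The candidate names and scores are placeholders, not real benchmark data.

```python
from typing import Dict, Tuple


def pick_model(
    candidates: Dict[str, Dict[str, float]],
    weights: Dict[str, float],
) -> Tuple[str, Dict[str, float]]:
    """Score each candidate as a weighted sum of its metrics; return the best and all totals."""
    totals = {
        name: sum(weights[metric] * metrics[metric] for metric in weights)
        for name, metrics in candidates.items()
    }
    best = max(totals, key=totals.get)
    return best, totals


# Placeholder scores in [0, 1], higher is better. NOT real benchmark numbers;
# substitute your own eval results.
candidates = {
    "model-a": {"quality": 0.9, "cost": 0.3, "latency": 0.5},
    "model-b": {"quality": 0.6, "cost": 0.9, "latency": 0.8},
}
weights = {"quality": 0.5, "cost": 0.3, "latency": 0.2}

best, totals = pick_model(candidates, weights)
print(best, totals)
```

Raising the `quality` weight relative to `cost` and `latency` would shift the choice toward `model-a`, which is the kind of trade-off the weights are meant to express.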

How can I start using Merge Gateway to route requests with Jamba 1.5 Large?

Getting Jamba 1.5 Large running through Merge Gateway takes a few minutes:

1. Create an account and get your API key from the dashboard.

2. Install the Merge Gateway SDK: run pip install merge-gateway-sdk (Python) or npm install merge-gateway-sdk (Node). Alternatively, if you're already using the OpenAI SDK, set base_url = "https://api-gateway.merge.dev/v1/openai" and your existing code works as-is.

3. Make your first request using the provider/model format. For Jamba 1.5 Large, the model string is ai21/jamba-1.5-large. Swap the model string to route to any other provider without changing anything else.

4. Configure a routing policy in the dashboard to set failover behavior, cost limits, and optimization strategy. Your first policy can be as simple as naming Jamba 1.5 Large as primary with one fallback.

Full setup instructions and SDK references are in the Merge Gateway docs.
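As a sketch of step 3, the helper below builds a request for `ai21/jamba-1.5-large` in the same shape as the Python SDK example earlier on this page. `build_request` is a hypothetical helper, and the client call is left commented out so the snippet runs without the SDK or an API key.

```python
def build_request(model: str, system: str, user: str) -> dict:
    """Build keyword arguments in the shape used by the SDK example on this page."""
    return {
        "model": model,
        "input": [
            {"type": "message", "role": "system", "content": system},
            {"type": "message", "role": "user", "content": user},
        ],
    }


request = build_request(
    model="ai21/jamba-1.5-large",  # model string given in step 3
    system="You are a concise assistant.",
    user="Summarize the key obligations in this contract.",
)
print(request["model"])

# With the SDK installed and a real key (assumed to behave as in the earlier example):
# from merge_gateway import MergeGateway
# client = MergeGateway(api_key="YOUR_API_KEY")
# response = client.responses.create(**request)
# print(response.output[0].content[0].text)
```

Keeping the model string as a single argument means switching providers is a one-line change.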

Try Jamba 1.5 Large through Merge Gateway

Route, observe, and control AI requests across providers from one API.

