Jamba 1.5 Mini:
Everything you need to know about the model

Jamba 1.5 Mini is an AI21 Labs model available through Merge Gateway. It offers a 256,000-token context window and works with Gateway routing policies, spend controls, and request logs. It supports streaming through at least one Gateway vendor route.

Jamba 1.5 Mini performance*

Intelligence - general reasoning and knowledge
8%
*Performance data is provided by Artificial Analysis and is subject to change.

Jamba 1.5 Mini pricing

| Vendor | Input / 1M tokens | Output / 1M tokens | Zero data retention |
| --- | ---: | ---: | --- |
| Amazon Bedrock | $0.2000 | $0.4000 | Yes |

Ready to try it out?

Start routing requests to hundreds of large language models in your product within minutes.

Route requests to Jamba 1.5 Mini with Merge Gateway

Merge Gateway is a unified LLM API that lets your product route requests to Jamba 1.5 Mini and every other major model through a single endpoint. You get built-in fallback routing, per-request cost tracking, zero data retention support, and observability without changing your application architecture.
To get started in seconds, add our Gateway Implementation skill to your project, or pick your preferred SDK below. Check out our other quick start skills here.
Install the Merge Gateway SDK
Python
pip install merge-gateway-sdk
Send a request
Python
from merge_gateway import MergeGateway

client = MergeGateway(api_key="YOUR_API_KEY")

response = client.responses.create(
    model="openai/gpt-5.2",
    input=[
        {"type": "message", "role": "system", "content": "You are a helpful programming tutor. Explain the concepts clearly with practical examples."},
        {"type": "message", "role": "user", "content": "Explain the concept of recursion in programming with a simple set of examples."},
    ],
)

print(response.output[0].content[0].text)
Try a different model
Swap the model string to route to a different provider. No other code changes needed.
Anthropic
response = client.responses.create(
    model="anthropic/claude-sonnet-4-20250514",
    input=[
        {"type": "message", "role": "system", "content": "You are a helpful programming tutor. Explain the concepts clearly with practical examples."},
        {"type": "message", "role": "user", "content": "Explain the concept of recursion in programming with a simple set of examples."},
    ],
)
Point to Gateway
Python
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api-gateway.merge.dev/v1/openai",
)
Send a request
Use the standard chat.completions.create method. No provider prefix needed on the model name.
Python
response = client.chat.completions.create(
    model="gpt-5.2",
    messages=[
        {"role": "system", "content": "You are a helpful programming tutor. Explain the concepts clearly with practical examples."},
        {"role": "user", "content": "Explain the concept of recursion in programming with a simple set of examples."},
    ],
)

print(response.choices[0].message.content)
Install packages
npm install merge-gateway-ai-sdk-provider ai
Create the provider
TypeScript
import { createMergeGateway } from "merge-gateway-ai-sdk-provider";

const gateway = createMergeGateway({
  apiKey: "YOUR_API_KEY",
});
Send a request
Use generateText to send a request. Model names use the provider/model format.
TypeScript
import { generateText } from "ai";

const { text } = await generateText({
  model: gateway("openai/gpt-4o"),
  prompt: "Explain the concept of recursion in programming with a simple set of examples.",
});

console.log(text);
If you already have @ai-sdk/openai installed, point it at Gateway with a base URL change:
TypeScript
import { createOpenAI } from "@ai-sdk/openai";

const gateway = createOpenAI({
  apiKey: "YOUR_API_KEY",
  baseURL: "https://api-gateway.merge.dev/v1/ai-sdk",
});

// All generateText/streamText calls work unchanged
Point to Gateway and send a request
Anthropic SDK
from anthropic import Anthropic

client = Anthropic(
    api_key="YOUR_API_KEY",
    base_url="https://api-gateway.merge.dev/v1/anthropic",
)

message = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Explain the concept of recursion in programming with a simple set of examples."},
    ],
)

print(message.content[0].text)

Explore other models available in Merge Gateway

GPT-4.1 Mini
GPT-4.1 Nano
GPT-4o
GPT-4o Mini
GPT-4 Turbo
GPT-5
GPT-5.1
GPT-5.2
GPT-5.3
GPT-5.4
GPT-5.4 Mini
GPT-5.4 Nano
GPT-5.5
GPT-5 mini
GPT-5 Nano
Grok 3
Grok 3 Mini
Grok 4.1 Fast Non-Reasoning
Grok 4.1 Fast Reasoning
Grok 4.20 API
Grok 4.3
Grok 4 Fast Non-Reasoning
Grok 4 Fast Reasoning
Grok Code Fast 1

Jamba 1.5 Mini FAQ

If you have additional questions about Jamba 1.5 Mini, we've answered several below. Keep in mind that this information was written in June 2026 and may change over time.

What other models does AI21 Labs offer?

AI21 Labs builds the Jamba family of hybrid Mamba-Transformer models, with options spanning compact on-device builds to large-scale enterprise deployments. Here are some other models AI21 Labs supports:

  • Jamba 1.5 Large: It's the more capable sibling to Jamba 1.5 Mini, with 398 billion total parameters and 94 billion active. It's designed for tasks that require more reasoning depth, and is priced at $2.00 per 1M input tokens and $8.00 per 1M output tokens
  • Jamba 1.6 Mini: Released in March 2025, this is the direct successor to Jamba 1.5 Mini and shares the same 52 billion total / 12 billion active parameter count. It runs at approximately 180.7 tokens per second, making it one of the fastest models available, while maintaining the same $0.20 per 1M input / $0.40 per 1M output pricing
  • Jamba 1.6 Large: The updated flagship model from March 2025, offering the same large-scale parameter count as Jamba 1.5 Large but with improved throughput at around 58.6 tokens per second. It carries the same pricing as Jamba 1.5 Large but with better performance per dollar
  • Jamba2 Mini: A newer model positioned for core enterprise workflows, emphasizing reliability and steerability. It's available for download on Hugging Face and suited for teams that want to self-host AI21's latest generation
  • Jamba2 3B: A compact model built specifically for on-device applications and agentic workflows. It prioritizes a small deployment footprint over raw capability, making it relevant for edge or embedded use cases

How does Jamba 1.5 Mini differ from AI21 Labs's other models?

Jamba 1.5 Mini sits at the efficient, cost-optimized end of the 1.5 generation, built for workloads where throughput and price matter more than maximum reasoning depth.

  • Parameters: Jamba 1.5 Mini has 52 billion total parameters with 12 billion active during inference, versus Jamba 1.5 Large's 398 billion total and 94 billion active. This is the primary differentiator in capability between the two models
  • Pricing: Jamba 1.5 Mini costs $0.20 per 1M input tokens and $0.40 per 1M output tokens. Jamba 1.5 Large costs $2.00 per 1M input and $8.00 per 1M output, so Mini is approximately 10x cheaper on input and 20x cheaper on output for the same generation
  • Context window: Both models share a 256k token context window, so long-document processing is equally accessible at Mini's lower price point
  • Generation position: Jamba 1.5 Mini has been succeeded by Jamba 1.6 Mini, which offers faster output speeds while maintaining the same pricing and context window

Jamba 1.5 Mini is the right choice when your use case involves high-volume, lower-complexity requests and the 256k context window is a requirement, but spending at Jamba 1.5 Large's rate isn't justified.
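
The pricing gap described above can be checked with a few lines of arithmetic. This is an illustrative sketch only: the rates are the per-1M-token prices quoted in this FAQ, while the `PRICES` table, the `request_cost` helper, and the sample token counts are assumptions made for the example and are not part of any Merge Gateway SDK.

```python
# Per-1M-token prices in USD, as quoted in this FAQ; they may change over time.
PRICES = {
    "jamba-1.5-mini": {"input": 0.20, "output": 0.40},
    "jamba-1.5-large": {"input": 2.00, "output": 8.00},
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request for a model listed in PRICES."""
    rates = PRICES[model]
    return (input_tokens * rates["input"] + output_tokens * rates["output"]) / 1_000_000


# Hypothetical workload: a 100,000-token document summarized into 1,000 tokens.
mini = request_cost("jamba-1.5-mini", 100_000, 1_000)
large = request_cost("jamba-1.5-large", 100_000, 1_000)

print(f"Mini:  ${mini:.4f}")
print(f"Large: ${large:.4f}")
print(f"Large costs {large / mini:.1f}x as much for this request")
```

For input-heavy requests like this one, the overall ratio stays close to the 10x input gap; the 20x output gap matters more as responses get longer.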

What models should I consider using alongside Jamba 1.5 Mini?

No single model is optimal for every task. Here are models worth pairing with Jamba 1.5 Mini depending on what your product needs:

  • Jamba 1.5 Large: When a request requires more reasoning or instruction-following depth than Mini can handle reliably, routing to Jamba 1.5 Large gives you the same provider, same architecture, and the same 256k context window at a higher capability tier. This keeps your stack consistent while adding a quality escalation path
  • Claude Haiku 3.5: For tasks where you want strong instruction following at a low cost and don't need a 256k context window, Claude Haiku 3.5 is a well-benchmarked, cost-efficient alternative. It's a useful cross-provider fallback when AI21 Labs availability is a concern
  • Gemini 2.0 Flash: When speed is the primary constraint and you want a model with broader benchmark coverage, Gemini 2.0 Flash offers competitive throughput at a similar price tier. It's a practical pairing for latency-sensitive classification or extraction workloads
  • GPT-4o: For the subset of requests in your traffic that require strong reasoning, code generation, or multimodal input, GPT-4o provides significantly higher capability. Routing only complex requests there keeps overall costs manageable while not bottlenecking on Mini's intelligence ranking
  • Mistral Small: As another cost-efficient open-weight model, Mistral Small is worth pairing when you want a fast, low-cost fallback from a different provider. It rounds out a multi-provider routing setup without adding much complexity

What are the challenges of using Jamba 1.5 Mini in my product?

Like any production LLM, Jamba 1.5 Mini comes with tradeoffs worth planning for:

  • Intelligence ranking: Artificialanalysis.ai ranks Jamba 1.5 Mini 8th out of 39 comparable models on its Intelligence Index, placing it toward the lower end of evaluated models. For tasks requiring reliable reasoning, this may mean a higher error rate or more prompt engineering overhead than a higher-ranked model would require
  • Knowledge cutoff: Jamba 1.5 Mini has a training data cutoff of March 5, 2024. Products relying on it for current information will need retrieval augmentation, and the age of the cutoff means the gap between training data and current events is significant
  • Speed data availability: Unlike its successor Jamba 1.6 Mini, published throughput benchmarks for Jamba 1.5 Mini aren't widely available from independent sources. Planning latency-sensitive SLAs requires your own benchmarking against your actual workload
  • Provider dependency: Routing all requests through AI21 Labs creates a single point of failure. If AI21 Labs experiences an outage or deprecates this model version, there's no automatic fallback unless you've built one into your routing layer
  • Cost at scale: While Jamba 1.5 Mini is priced attractively per token, high request volumes or unexpectedly long outputs still compound quickly without active budget controls in place

Why should I use Merge Gateway to route LLM requests with Jamba 1.5 Mini and every other model?

Using Jamba 1.5 Mini through Merge Gateway gives you access to the model itself and the infrastructure layer around it:

  • One API, every provider: Access Jamba 1.5 Mini and every other major LLM through a single endpoint and API key. Change providers by swapping the model string — no application code changes required
  • Intelligent routing and automatic failover: Merge routes around AI21 Labs outages automatically. Routing policies based on cost, latency, or quality can reduce spend by 40-60% without touching your application code
  • Cost governance: Set hard or soft project budgets so Jamba 1.5 Mini spend stays within plan. Every request is attributed to a model, project, and tag in a unified billing dashboard across all providers
  • Build Your Own Router: Define what "best" means for your traffic by selecting from curated ML benchmarks or adding your own eval scores. The router scores each available model against your weights and picks the winner per request, with a plain-language explanation of every decision
  • Security and compliance controls: Apply DLP rules and prompt injection protection before every request reaches AI21 Labs. Enforce per-project model and region policies without adding that logic to your application
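
To make the "Build Your Own Router" idea more concrete, here is a minimal sketch of weighted model scoring. It is not Merge Gateway's implementation: the `pick_model` function, the metric names, and every score below are made-up placeholders chosen purely for illustration.

```python
def pick_model(
    candidates: dict[str, dict[str, float]],
    weights: dict[str, float],
) -> tuple[str, dict[str, float]]:
    """Score each candidate as a weighted sum of its metrics and return the best one."""
    totals = {
        name: sum(weights.get(metric, 0.0) * value for metric, value in metrics.items())
        for name, metrics in candidates.items()
    }
    best = max(totals, key=totals.get)
    return best, totals


# Placeholder scores on a 0-1 scale (higher is better); not real benchmark data.
candidates = {
    "ai21/jamba-1.5-mini": {"quality": 0.4, "cost": 0.9, "speed": 0.7},
    "openai/gpt-4o": {"quality": 0.9, "cost": 0.3, "speed": 0.6},
}

# A cost-sensitive policy and a quality-first policy pick different winners.
cheap_first, _ = pick_model(candidates, {"quality": 0.2, "cost": 0.6, "speed": 0.2})
quality_first, _ = pick_model(candidates, {"quality": 0.8, "cost": 0.1, "speed": 0.1})

print("Cost-sensitive policy picks:", cheap_first)
print("Quality-first policy picks:", quality_first)
```

Returning the per-model totals alongside the winner keeps each routing decision explainable, which is the property the bullet above describes.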

How can I start using Merge Gateway to route requests with Jamba 1.5 Mini?

Getting Jamba 1.5 Mini running through Merge Gateway takes a few minutes:

1. Create an account and get your API key from the dashboard.

2. Install the Merge Gateway SDK: run pip install merge-gateway-sdk (Python) or npm install merge-gateway-sdk (Node). Alternatively, if you're already using the OpenAI SDK, set base_url = "https://api-gateway.merge.dev/v1/openai" and your existing code works as-is.

3. Make your first request using the provider/model format. For Jamba 1.5 Mini, the model string is ai21/jamba-1.5-mini. Swap the model string to route to any other provider without changing anything else.

4. Configure a routing policy in the dashboard to set failover behavior, cost limits, and optimization strategy. Your first policy can be as simple as naming Jamba 1.5 Mini as primary with one fallback.

Full setup instructions and SDK references are in the Merge Gateway docs.
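
Steps 3 and 4 can be sketched in code. The payload shape mirrors the `client.responses.create` example earlier on this page; the `build_request` and `send_with_fallback` helpers, the `send` callable, and the choice of fallback model are assumptions for illustration. In practice, routing policies configured in the dashboard handle failover, so this only shows the "primary with one fallback" concept on the client side.

```python
from typing import Callable

PRIMARY = "ai21/jamba-1.5-mini"  # model string given in step 3
FALLBACK = "anthropic/claude-sonnet-4-20250514"  # example fallback, chosen for illustration


def build_request(model: str, user_prompt: str) -> dict:
    """Build keyword arguments in the shape used by the responses.create example."""
    return {
        "model": model,
        "input": [{"type": "message", "role": "user", "content": user_prompt}],
    }


def send_with_fallback(
    send: Callable[..., object], prompt: str, models: list[str]
) -> tuple[str, object]:
    """Try each model in order; return (model, response) from the first that succeeds."""
    last_error = None
    for model in models:
        try:
            return model, send(**build_request(model, prompt))
        except Exception as error:  # a real client would catch narrower error types
            last_error = error
    raise RuntimeError("all models failed") from last_error


# Stand-in for a real client call; it simulates an outage on the primary model.
def fake_send(model: str, input: list) -> str:
    if model == PRIMARY:
        raise ConnectionError("simulated outage")
    return f"response from {model}"


used, reply = send_with_fallback(fake_send, "Summarize this contract.", [PRIMARY, FALLBACK])
print(used, "->", reply)
```

With a real client, you would pass the SDK's request method (for example `client.responses.create`) as `send` instead of `fake_send`.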

Try Jamba 1.5 Mini through Merge Gateway

Route, observe, and control AI requests across providers from one API.