OpenRouter Model Stacks for AI Assistants

A model stack is a curated set of AI models that your OpenClaw assistant routes across intelligently. Instead of locking you to one model, the stack hands every message to OpenRouter’s auto-router, which picks the best model from the set for that specific request, a hard coding question might go to the strongest model, a quick reply to a faster, cheaper one. Each stack also has a built-in cost-vs-quality lean, so the routing matches the stack’s intent.

Stacks at a glance

Wanderer

Free models only. Explore without spending a credit.

Gemma 4 26B

Google

North Mini Code

Cohere

Nemotron 3 Ultra 550B

Nvidia

Hustler

Efficient paid models. Quality stays high, spend stays low.

Gemini Flash Lite

Google

GPT-5 Nano

OpenAI

Hy3

Tencent

Professional Default

Balanced quality and speed for everyday work.

Claude Haiku 4.5

Anthropic

GPT-5.6 Luna

OpenAI

GPT-5.6 Terra

OpenAI

Qwen 3.7 Max

Qwen

Operator

The best models available. For when quality is everything.

Claude Sonnet 4.6

Anthropic

Claude Opus 4.8

Anthropic

Grok Build 0.1

xAI

GPT-5.6 Sol

OpenAI

How we choose models

Every stack has different criteria. We cross-reference PinchBench (real-world task success rates and cost-per-run value scores) with OpenRouter (live pricing, throughput, availability) before making any changes.

Stack	Capability bar	Cost bar
Wanderer	Solid task completion	Must be `:free` on OpenRouter
Hustler	High success rate, strong value score	Lowest cost-per-run with acceptable quality
Professional	High success rate, fast throughput	Mid cost, latency matters here
Operator	Best available, leads on reasoning	Cost is secondary

Model details and benchmarks

Wanderer Free

North Mini Code

Cohere

Primary

256K tokens Free

CX23-safe free coding model with tool support. New to the stack, so availability and real-world quality are being monitored.

Nemotron 3 Ultra 550B

Nvidia

Fallback 1

1M tokens Free

PinchBench avg score 89.9

Source: pinchbench.com ?,

Best-scoring free model on PinchBench. Single endpoint with variable uptime, so it backs up rather than leads.

Gemma 4 26B

Google

Fallback 2

262K tokens Free

PinchBench value score 173.1

Source: pinchbench.com ?,

Swapped in from the 31B variant (Model Scout, Jul 2026) — 2 providers instead of 1, best value score on PinchBench. Also handles image messages for this stack.

Hustler Low cost

Gemini 2.5 Flash Lite

Google

Primary

1M tokens $0.10 / $0.40 per M

PinchBench value score 367

Source: pinchbench.com ?,

Optimised for ultra-low latency and cost efficiency. Built-in reasoning via API.

Hy3

Tencent

Fallback 1

262K tokens $0.14 / $0.58 per M Jul 2026

New paid saver candidate with three providers, configurable reasoning, tool use, and a safe 262K/131K context-output profile.

GPT-5 Nano

OpenAI

Fallback 2

128K tokens $0.15 / $0.60 per M

Reliable OpenAI fallback. Consistent availability and broad task coverage.

Professional Mid cost

GPT-5.6 Luna

OpenAI

Primary

1M tokens $1 / $6 per M Jul 2026

PinchBench success rate 88.7

Source: pinchbench.com ?,

OpenAI's cost-sensitive GPT-5.6 tier. Five live endpoints and a safe 1M/128K context-output profile.

Claude Haiku 4.5

Anthropic

Fallback 1

200K tokens $1 / $5 per M

SWE-bench Verified 73

Source: openrouter.ai ?,

Near-frontier intelligence at low latency. Matches Sonnet 4 on reasoning, coding, and computer use.

GPT-5.6 Terra

OpenAI

Fallback 2

1M tokens $2.50 / $15 per M Jul 2026

PinchBench success rate 89.7

Source: pinchbench.com ?,

OpenAI's balanced GPT-5.6 tier for everyday coding, reasoning, and agentic work.

Qwen 3.7 Max

Qwen

Fallback 3

1M tokens $1.25 / $3.75 per M May 2026

PinchBench avg score 92.5

Source: pinchbench.com ?,

Excellent OpenClaw benchmark performance and value. Single-provider availability is covered by the stack's outer fallback.

Operator Premium

Claude Sonnet 4.6

Anthropic

Primary

1M tokens $3 / $15 per M Feb 2026

SWE-bench Verified 72.7

Source: openrouter.ai ?,

Frontier performance across coding, agents, and professional work. Leads on instruction-following and complex reasoning.

Claude Opus 4.8

Anthropic

Fallback 1

1M tokens $5 / $25 per M

PinchBench avg score 90.5

Source: pinchbench.com ?,

PinchBench top-tier model with seven endpoints. Every live provider reports a safe 128K max output on its 1M context.

GPT-5.6 Sol

OpenAI

Fallback 2

1M tokens $5 / $30 per M Jul 2026

OpenAI's current flagship for complex professional work, coding, and agentic reasoning.

Grok Build 0.1

xAI

Fallback 3

256K tokens $1 / $2 per M

PinchBench success rate 88.9

Source: pinchbench.com ?,

Added Jul 2026 (Model Scout) — xAI's first entry in our stacks. Coding/agent-focused, 4 xAI-operated endpoints.

How auto-routing works

For each message, OpenRouter’s auto-router looks at the models in your stack and picks the best fit for that request, weighing capability against cost using the stack’s lean (Operator leans hard toward quality; Wanderer and Hustler lean toward savings; Professional sits in the middle). Harder tasks pull a stronger model; simple ones get a faster, cheaper one. You don’t configure anything, it happens per message, automatically, within the set you’ve chosen. If the router itself ever has a hiccup, a reliable safety model in the stack catches it, so there’s no downtime.

Credit usage

Costs vary by model and message length. Rough per-message estimates:

Stack	Typical cost per message
Wanderer	$0
Hustler	$0.001–$0.005
Professional	$0.003–$0.015
Operator	$0.015–$0.08

Check your Dashboard usage tab to monitor actual spend.

Switching stacks

Open your Dashboard, go to Settings, and select a new stack. The change applies immediately, no redeploy needed. You can also ask your bot to use any individual model for a specific conversation without changing your stack.

Model compatibility

Stack models are tested for compatibility with the CX23 VPS before any changes are made. If you use the custom model picker, be aware that some newer 1M context models have a token specification issue that causes immediate overflow errors. See VPS specs and model limits for the full details.