Use Auriko as your LLM backend for Claude Code and the Claude Agent SDK.

Set up Claude Code

Add three environment variables to your shell profile (~/.zshrc or ~/.bashrc):
export ANTHROPIC_BASE_URL="https://api.auriko.ai"
export ANTHROPIC_AUTH_TOKEN="ak_live_..."  # from auriko.ai/dashboard
export ANTHROPIC_API_KEY=""
Then run claude as usual. Every Claude Code session now routes through Auriko.
ANTHROPIC_API_KEY="" must be explicitly set to empty. If it contains a value, Claude Code uses it directly against Anthropic, bypassing the gateway.
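Because a non-empty ANTHROPIC_API_KEY silently bypasses the gateway, it can be worth validating the environment before launching. A minimal sketch (the gateway_env_ok helper is hypothetical, not part of Claude Code or Auriko):

```python
def gateway_env_ok(env: dict) -> bool:
    """Check that this environment will route Claude Code through the gateway."""
    return (
        env.get("ANTHROPIC_BASE_URL") == "https://api.auriko.ai"
        # The Auriko token must be present and non-empty...
        and bool(env.get("ANTHROPIC_AUTH_TOKEN"))
        # ...and ANTHROPIC_API_KEY must be explicitly set to the empty string:
        # any value here makes Claude Code call Anthropic directly.
        and env.get("ANTHROPIC_API_KEY") == ""
    )
```

Run it against dict(os.environ) before starting a session; an unset ANTHROPIC_API_KEY also fails the check, matching the requirement above that it be explicitly empty.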

Set up Claude Agent SDK

The Claude Agent SDK spawns the Claude Code CLI as a subprocess. Pass the environment variables through ClaudeAgentOptions.env:
import os
from claude_agent_sdk import ClaudeAgentOptions, query

options = ClaudeAgentOptions(
    model="sonnet",
    system_prompt="You are a code review assistant.",
    allowed_tools=["Read", "Grep", "Glob"],
    env={
        "ANTHROPIC_BASE_URL": "https://api.auriko.ai",
        "ANTHROPIC_AUTH_TOKEN": os.environ["AURIKO_API_KEY"],
        "ANTHROPIC_API_KEY": "",
    },
)

async for message in query(prompt="Review main.py for bugs", options=options):
    print(message)
As a defensive practice, pass setting_sources=[] in the options to disable loading filesystem settings; this guarantees your env values take effect regardless of SDK version.
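In effect, the SDK overlays these values on the inherited environment before spawning the CLI. Roughly (an illustrative stdlib sketch, not the SDK's actual implementation):

```python
import os

def claude_cli_env(overrides: dict) -> dict:
    """Overlay ClaudeAgentOptions.env-style overrides on the parent environment."""
    env = dict(os.environ)
    env.update(overrides)  # later values win, so the gateway settings take effect
    return env

# The SDK then launches the CLI subprocess with something like:
# subprocess.Popen(["claude", ...], env=claude_cli_env(overrides))
```

This is why an empty-string ANTHROPIC_API_KEY in the overrides matters: it replaces any key inherited from the parent shell.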

Use non-Claude models

Claude Code supports any model via Auriko’s routing:
claude --model gpt-4o
claude --model deepseek-chat
claude --model gemini-2.5-flash
Switch models mid-session with /model gpt-4o, or override default aliases:
export ANTHROPIC_DEFAULT_SONNET_MODEL="gpt-4o"
export ANTHROPIC_DEFAULT_OPUS_MODEL="claude-opus-4-6"
export ANTHROPIC_DEFAULT_HAIKU_MODEL="claude-haiku-4-5"
Model names are the same canonical names you use with Auriko’s OpenAI endpoint.

Control costs and routing

Without Auriko, Claude Code talks directly to Anthropic. One provider, one price, no controls. With Auriko:
  • Cost visibility: see what Claude Code costs per session, per key, per team member
  • Budget controls: cap spend per API key so a runaway agentic loop doesn’t burn the budget
  • Provider routing: same Claude model available through Anthropic direct, Bedrock, and Vertex; Auriko picks the cheapest or fastest based on workspace config
  • Fallbacks: if Anthropic direct is rate-limited or down, automatically fail over to Bedrock or Vertex
  • Data policies: enforce zero data retention (ZDR) or no-training across all Claude Code usage
  • Unified billing: Claude Code usage in the same dashboard as all other LLM usage
  • Non-Claude models: claude --model gpt-4o routes through Auriko to OpenAI

Use the Anthropic SDK directly

Any application built on the Anthropic SDK can use Auriko as a drop-in backend:
import os
import anthropic

client = anthropic.Anthropic(
    base_url="https://api.auriko.ai",
    api_key=os.environ["AURIKO_API_KEY"],
)

response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello"}],
)

Configure routing

Pass routing options through extra_body:
response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello"}],
    extra_body={
        "gateway": {
            "models": ["claude-sonnet-4-6", "claude-opus-4-6"],
            "routing": {"optimize": "cost"},
        },
    },
)
For Claude Code, configure routing at the workspace level via the Auriko dashboard. The CLI doesn’t support per-request routing options.
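When routing varies per request in SDK code, the extra_body payload above can be built with a small helper (hypothetical; it simply mirrors the gateway schema shown in this section):

```python
def gateway_extra_body(models, optimize="cost"):
    """Build the Auriko routing payload for extra_body, per the schema above."""
    return {
        "gateway": {
            "models": list(models),             # preference-ordered candidates
            "routing": {"optimize": optimize},  # e.g. "cost", as shown above
        }
    }
```

Pass the result as extra_body in client.messages.create so each call can carry its own candidate list.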

Check compatibility

  • All Claude Code features work through Auriko: tool use (Bash, Read, Write, Edit), MCP servers, streaming, extended thinking, and prompt caching
  • Agentic features (file operations, bash execution, MCP) run locally in the CLI; only LLM inference calls route through Auriko
  • Non-Claude models work for simple tasks but degrade on complex multi-file refactors (this is inherent to Claude Code, not Auriko-specific):
| Capability | Claude | GPT-4o | DeepSeek/Gemini | Local models |
| --- | --- | --- | --- | --- |
| Tool calling | Excellent | Good | Simple calls work | Unreliable |
| Edit accuracy | ~98% | ~85-90% | ~60-80% | Worst |
| Long sessions | Stable | Degrades | Degrades faster | Degrades fastest |
  • Fast mode’s org eligibility check is hardcoded to https://api.anthropic.com and doesn’t respect ANTHROPIC_BASE_URL. Fast mode is silently disabled when routing through a gateway.