Track changes to the Auriko API, SDKs, and supported models.Documentation Index
Fetch the complete documentation index at: https://docs.auriko.ai/llms.txt
Use this file to discover all available pages before exploring further.
0.2.0 — 2026-05-18
Routing & Providers
- Priority tier routing:
gateway.routing.tier: "priority"enables Anthropic fast mode (2.5x speed, 6x cost). Premium-tier offerings excluded from routing by default - Requests with
tool_choice="required"route only to providers that honor the constraint. Returnstool_choice_required_not_supportedif no capable provider is available - SiliconFlow-hosted models are routable, including 13 exclusive models and reasoning variants (DeepSeek-R1, Qwen3 thinking)
qwen-3.6-plusroutes correctly through Fireworks AI and Together AI- Fireworks AI no longer serves 9 open-weight Qwen 3 models. All remain routable through other providers
Parameters & Request Handling
- Default timeouts support reasoning models with longer first-token times.
deadline_msis opt-in only developermessage role works across all providers- Empty assistant messages (
content: null,"", or empty text blocks) route correctly to Anthropic
Model Compatibility
- Anthropic
response_formatsupports type arrays (e.g.,type: ["string", "null"]) with correct nullable enum and object handling - Zero-parameter tools validate correctly on Anthropic and Google models
reasoning_effortcoexists withtemperature,top_p, andtop_kon Claude models. Incompatible sampling parameters are stripped with a warningreasoning_effortnormalization across providers:- Claude:
"xhigh"preserved on Opus 4.7, clamped to"high"on Opus 4.5, mapped to"max"on other models - Gemini 3.x:
"xhigh"and"max"normalize to"high" - OpenAI and xAI:
"max"normalizes to each model’s ceiling - All normalizations include an
unsupported_parameterwarning
- Claude:
reasoning_effortwith extrememax_tokenson Gemini clamps to each model’s accepted range- DeepInfra Gemma 3, Qwen3 VL, and MiMo return correct tool call responses, streaming and non-streaming
- MiniMax models separate reasoning into
reasoning_contentcleanly output_tokensin Anthropic-format responses includes reasoning tokens for Gemini and xAIllama-3.1-8b-instruct-turboreturns a clean capability error for unsupported structured output
Error Handling
- Streaming errors return structured messages (
message,type,code,provider) instead of empty objects - Multi-provider fallback errors include per-provider failure details
Platform
- New signups receive a personalized welcome email
- Playground credit for new signups drops from 0.50
SDKs
- Python SDK
auriko0.2.0 - TypeScript SDK
@auriko/sdk0.2.0 - Vercel AI SDK provider
@auriko/ai-sdk-provider0.2.0
Integrations
- Kilo Code integration guide
0.1.1 — 2026-05-12
Parameters & Request Handling
- User-configurable timeouts via
gateway.routing.timeout_ms(per-attempt) andgateway.routing.deadline_ms(request deadline)- Streaming uses
timeout_msas time-to-first-byte - Defaults: 10s streaming first-byte / 120s streaming deadline; non-streaming 120s / 180s
- Streaming uses
reasoning_effortaccepts"xhigh"and"max"levels. Anthropic models use adaptive thinking with automatic effort-to-budget translation- Timeout errors include actionable guidance naming the configurable parameters
provider_timeoutwarning emitted when fallback succeeds after a timeout
timeout_msanddeadline_msapply to all execution paths includingallow_fallbacks: false
Error Handling
- Inference validation returns specific error codes —
missing_required_parameter,invalid_parameter_value,unknown_field,operation_not_allowed— for precise programmatic error handlinginvalid_requestis reserved for malformed request bodies. HTTP status codes unchanged
- Platform operations return
state_precondition_failed,duplicate_resource, andoperation_not_allowed. 405 responses returnmethod_not_allowed - New platform codes:
field_immutable,invalid_recovery_code,mfa_verification_failed
Model Compatibility
- Gemini tool calls return correct
finish_reason: tool_callsfor client-side tool execution - Gemini accepts non-object tool results (numbers, booleans, arrays)
- Multi-turn tool calling works correctly with the OpenAI Python SDK on all providers
reasoning_effortworks on Gemma-4, qwen-3-vl-thinking, and gpt-5.4+ (when combined with tools)- Explicit
thinking.budget_tokensrequests (e.g. from Claude Code) route correctly to models requiring adaptive thinking - Pricing values display with correct precision
SDKs
- Python SDK
auriko0.1.1 - TypeScript SDK
@auriko/sdk0.1.1 - Response models include
service_tier,annotations, and prediction token fields
Integrations
- OpenCode models.dev definitions (15 curated models)
- Claude Code integration guide
0.1.0 — 2026-05-04
API
- OpenAI-compatible chat completions with multi-provider routing and automatic failover
- Anthropic Messages API compatibility —
POST /v1/messagesand token counting - Streaming and non-streaming responses across all supported providers
- Model directory with pricing, capability metadata, and provider health status
- Workspace billing and credit balance (Preview)
- Structured reasoning support with cryptographic signatures for multi-turn round-trip
SDKs
- Python SDK
auriko0.1.0 — with OpenAI Agents SDK compatibility viaauriko[openai-compat] - TypeScript SDK
@auriko/sdk0.1.0 - Vercel AI SDK provider
@auriko/ai-sdk-provider0.1.0
Frameworks
- LangChain, LlamaIndex, CrewAI, Google ADK, OpenAI Agents SDK
- Claude Agent SDK via
ANTHROPIC_BASE_URL - Vercel AI SDK
Models
- GPT-4o, Claude 3.5 Sonnet, Gemini 1.5 Pro, Llama 3.1, Mistral Large, DeepSeek V4, and more — see Models for the complete list