Prerequisites
- An Auriko API key
- A session token (for routing defaults)
- Python 3.10+ with
aurikoSDK installed (pip install auriko)- OR Node.js 18+ with
@auriko/sdkinstalled (npm install @auriko/sdk)
- OR Node.js 18+ with
- Familiarity with Routing options
How routing works
When you send a request, Auriko’s router:- Enumerates candidates — finds all providers offering the requested model(s)
- Filters by constraints — removes providers that violate your routing options (data policy, Bring Your Own Key (BYOK) requirement, min success rate, excluded providers)
- Scores by strategy — ranks remaining candidates using your
optimizestrategy:cost/cheapest: lowest price per tokenttft/speed: lowest latency to first tokenthroughput: highest tokens per secondbalanced: weighted combination of cost, latency, and throughput
- Selects and routes — selects from the ranked list, favoring higher-scored providers
- Falls back if needed — if the provider fails and
allow_fallbacksis true, retries with the next candidate (up tomax_fallback_attempts)
Use suffix shortcuts
Append a suffix to any model name for quick routing configuration:| Suffix | Strategy | Effect |
|---|---|---|
:floor | cheapest | Absolute lowest cost |
:cost | cost | Cost-optimized with more spread |
:nitro | speed | Fastest overall provider |
:fast | ttft | Fastest time to first token |
:balanced | balanced | Weighted combination |
ft:gpt-4o:org:custom) pass through unchanged.
Route across models
Passmodels instead of model to route across multiple models (mutually exclusive with model, max 10):
| Mode | Behavior |
|---|---|
pool (default) | Select the best-scoring provider across all requested models |
fallback | Try all providers for the first model, then the second model, and so on |
Set quality constraints
Filter providers by performance requirements:| Constraint | Type | Description |
|---|---|---|
min_throughput_tps | number | Minimum tokens per second |
min_success_rate | number (0–1) | Minimum success rate |
max_cost_per_1m | number | Maximum cost per 1M tokens (average of input + output) |
max_ttft_ms | number | Maximum time to first token in milliseconds |
weights | object | Custom scoring weights: { cost, ttft, throughput, reliability }. Overrides preset. |
Data policy
Control how providers handle your data:| Policy | Description |
|---|---|
none (default) | No restrictions |
no_training | Provider must not use data for training |
zdr | Zero data retention — strictest policy |
zdr > no_training > none. When a per-request policy intersects with an account-level policy, the most restrictive one wins.
Provider alias normalization
Provider names inproviders and exclude_providers are case-insensitive and support aliases:
| Alias | Canonical name |
|---|---|
google, google_ai, googleai, gemini | google_ai_studio |
fireworks | fireworks_ai |
together | together_ai |
Configure fallbacks
By default, Auriko retries with alternative providers on 429 (rate limit), 5xx (server error), and timeout responses.| Setting | Default | Description |
|---|---|---|
allow_fallbacks | true | Enable automatic fallback to alternative providers |
max_fallback_attempts | 3 | Maximum fallback attempts (not counting the primary attempt) |
Set workspace defaults
Set default routing options for all requests in a workspace:Routing defaults use session authentication, not API key authentication. See Authentication.
{} as the PATCH body.
Per-request routing options override workspace defaults. Model suffix overrides sit between workspace defaults and per-request options.
Precedence
- Per-request
routingoptions (highest) - Model suffix overrides (for example,
:floor) - Workspace routing defaults (lowest)