1. Enumerates candidates: finds all providers offering the requested model(s).
2. Filters by constraints: removes providers that violate your routing options (data policy, Bring Your Own Key (BYOK) requirement, performance constraints, excluded providers).
3. Scores by strategy: ranks remaining candidates using your optimize strategy:
   - cost: Cost-optimized, well-rounded
   - cost-focus: Aggressively minimize cost (default)
   - ttft: TTFT-optimized, well-rounded
   - ttft-focus: Aggressively minimize time to first token
   - tps: Throughput-optimized, well-rounded
   - tps-focus: Aggressively maximize tokens per second
   - balanced: All dimensions weighted evenly
4. Selects and routes: selects from the ranked list, favoring higher-scored providers.
5. Falls back if needed: if the provider fails and allow_fallbacks is true, retries with the next candidate (up to max_fallback_attempts).
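The first three steps can be sketched in a few lines of Python. This is an illustration only, not Auriko's implementation: the `Provider` class, the `rank_candidates` function, the weight names, and the "ratio to best in pool" scoring formula are all assumptions made for the example, and the only constraint modeled is an excluded-provider list. All metrics are assumed to be positive.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Provider:
    # Hypothetical record; field names are illustrative, not Auriko's schema.
    name: str
    models: frozenset
    cost_per_1m: float     # cost per 1M tokens
    ttft_ms: float         # median time to first token
    throughput_tps: float  # median tokens per second


def rank_candidates(providers, model, excluded=frozenset(), weights=None):
    """Return eligible providers, best score first (assumed scoring formula)."""
    weights = weights or {"cost": 1.0, "ttft": 1.0, "tps": 1.0}

    # Step 1: enumerate providers that offer the requested model.
    candidates = [p for p in providers if model in p.models]

    # Step 2: filter out providers that violate a constraint
    # (only an exclusion list is modeled here).
    candidates = [p for p in candidates if p.name not in excluded]
    if not candidates:
        return []

    # Step 3: score each metric relative to the best value in the pool,
    # so every term falls in (0, 1] and higher is better.
    best_cost = min(p.cost_per_1m for p in candidates)
    best_ttft = min(p.ttft_ms for p in candidates)
    best_tps = max(p.throughput_tps for p in candidates)

    def score(p):
        return (
            weights["cost"] * best_cost / p.cost_per_1m
            + weights["ttft"] * best_ttft / p.ttft_ms
            + weights["tps"] * p.throughput_tps / best_tps
        )

    return sorted(candidates, key=score, reverse=True)
```

In this sketch the first element of the returned list plays the role of the primary provider and the rest form the fallback chain. The selection step described above favors higher-scored providers, which may not be a strict top-of-list pick; the sketch does not model that.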
The router parses suffixes only when the model ID contains exactly one colon. Fine-tuned models with multiple colons (for example, ft:gpt-4o:org:custom) pass through unchanged.
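The one-colon rule can be expressed as a small helper. This is a sketch of the rule as stated above, not the router's code: `split_model_suffix` is a hypothetical name, the suffix `ttft` is used only as a sample value, and how the router treats unrecognized suffixes is not covered here.

```python
def split_model_suffix(model_id: str):
    """Split 'base:suffix' only when the ID contains exactly one colon."""
    if model_id.count(":") != 1:
        # Zero colons: nothing to parse. Two or more (for example,
        # fine-tuned model IDs): pass the ID through unchanged.
        return model_id, None
    base, suffix = model_id.split(":")
    return base, suffix
```

For instance, `split_model_suffix("gpt-4o:ttft")` yields a base and a suffix, while `split_model_suffix("ft:gpt-4o:org:custom")` returns the ID untouched with no suffix.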
Performance and cost constraints (max_ttft_ms, min_throughput_tps, max_cost_per_1m) are evaluated against median (p50) metrics. To score providers by worst-case (p95) TTFT or throughput instead, set ttft_percentile or throughput_percentile. See Choose metric percentile.
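A minimal sketch of how such a constraint check might look, assuming each provider exposes its median metrics as a dictionary. The function name `passes_constraints`, the dictionary keys, and the choice to treat a value exactly at the limit as passing are assumptions for illustration; only the three constraint names come from the text above.

```python
def passes_constraints(p50, max_ttft_ms=None, min_throughput_tps=None,
                       max_cost_per_1m=None):
    """Check one provider's median (p50) metrics against optional limits.

    `p50` is a hypothetical dict with keys 'ttft_ms', 'throughput_tps',
    and 'cost_per_1m'. A limit of None means the constraint is not set.
    """
    if max_ttft_ms is not None and p50["ttft_ms"] > max_ttft_ms:
        return False
    if min_throughput_tps is not None and p50["throughput_tps"] < min_throughput_tps:
        return False
    if max_cost_per_1m is not None and p50["cost_per_1m"] > max_cost_per_1m:
        return False
    return True
```

Because the check reads only p50 values, changing ttft_percentile or throughput_percentile would affect ranking but not which providers survive this filter.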
Not all providers support every optional parameter. By default, Auriko drops unsupported parameters and adds a warning to the response. Set require_parameters to true to route only to providers that accept the optional parameters you sent.
The following parameters have per-provider support. When you set require_parameters to true, Auriko checks that a provider supports each one you sent: temperature, top_p, seed, logit_bias, logprobs, top_logprobs, n, presence_penalty, frequency_penalty, user, parallel_tool_calls, web_search_options, verbosity, prompt_cache_key, safety_identifier.

require_parameters composes with other constraints. A provider must pass all filters to be eligible. You can check which parameters each provider supports via the model directory endpoint, where each provider entry includes accepted_params and supported_parameters fields.
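The difference between the default behavior and require_parameters can be sketched as follows. This is an illustration under stated assumptions: `check_params`, its return shape, and the warning text are hypothetical, and the set of supported parameters is passed in by hand rather than read from the model directory endpoint.

```python
# Parameters with per-provider support, as listed above.
PER_PROVIDER_PARAMS = {
    "temperature", "top_p", "seed", "logit_bias", "logprobs", "top_logprobs",
    "n", "presence_penalty", "frequency_penalty", "user",
    "parallel_tool_calls", "web_search_options", "verbosity",
    "prompt_cache_key", "safety_identifier",
}


def check_params(sent, supported, require_parameters=False):
    """Return (eligible, forwarded_params, warnings) for one provider.

    `sent` is the request's parameter dict; `supported` is the set of
    optional parameters this provider accepts (hypothetical inputs).
    """
    unsupported = sorted(
        k for k in sent if k in PER_PROVIDER_PARAMS and k not in supported
    )
    if unsupported and require_parameters:
        # Strict mode: the provider is filtered out instead of degraded.
        return False, {}, []
    # Default mode: drop what the provider cannot accept and warn.
    forwarded = {k: v for k, v in sent.items() if k not in unsupported}
    warnings = [f"dropped unsupported parameter: {k}" for k in unsupported]
    return True, forwarded, warnings
```

With `sent = {"temperature": 0.2, "seed": 7}` and a provider that supports only `temperature`, the default call forwards `temperature` and warns about `seed`, while `require_parameters=True` marks the provider ineligible.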
By default, Auriko scores providers using median (p50) metrics. You can switch to 95th-percentile (worst-case) independently for TTFT and throughput:
| Field | Controls | Default |
| --- | --- | --- |
| ttft_percentile | TTFT scoring (which providers rank higher) | p50 |
| throughput_percentile | Throughput scoring (which providers rank higher) | p50 |
Both scoring fields accept "p50" (median) or "p95" (worst-case). They work with presets and custom weights; no weights are required. For example, to rank providers by worst-case (p95) TTFT instead of median, set ttft_percentile to "p95".
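The effect of switching percentiles can be seen with two made-up providers. The sketch below is not Auriko's scoring code: `rank_by_ttft`, the provider names, and the latency figures are invented for illustration, and real scoring combines more dimensions than TTFT alone.

```python
def rank_by_ttft(providers, ttft_percentile="p50"):
    """Order provider names by TTFT at the chosen percentile, fastest first."""
    if ttft_percentile not in ("p50", "p95"):
        raise ValueError('ttft_percentile must be "p50" or "p95"')
    ordered = sorted(providers, key=lambda p: p["ttft_ms"][ttft_percentile])
    return [p["name"] for p in ordered]


# Illustrative numbers only: "steady" is slightly slower at the median
# but far more consistent in the tail than "spiky".
SAMPLE = [
    {"name": "spiky", "ttft_ms": {"p50": 200, "p95": 1500}},
    {"name": "steady", "ttft_ms": {"p50": 250, "p95": 400}},
]
```

Here `rank_by_ttft(SAMPLE)` puts `spiky` first, while `rank_by_ttft(SAMPLE, "p95")` puts `steady` first, which is the trade-off the p95 setting is meant to surface.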
Auriko’s “priority” tier refers to Anthropic Fast Mode, not Anthropic’s separate Priority Tier (committed capacity SLA). See Routing options for details.
By default, Auriko retries with alternative providers on 429 (rate limit), 5xx (server error), and timeout responses.
| Setting | Default | Description |
| --- | --- | --- |
| allow_fallbacks | true | Enable automatic fallback to alternative providers |
| max_fallback_attempts | 19 | Safety ceiling on fallback attempts beyond the primary, range 1-19 (chain length 20 total) |
| timeout_ms | 20000 (streaming) / 180000 (non-streaming) | Per-attempt timeout in milliseconds. For streaming: time to first byte. For non-streaming: time to complete response |
| deadline_ms | None (streaming) / 540000 (non-streaming) | Hard wall-clock cap across all fallback attempts. Non-streaming requests default to 9 minutes; streaming has no default (connections are long-lived). Set explicitly to override |
You can configure per-attempt timeouts with timeout_ms. To set a hard wall-clock cap across all fallback attempts, use deadline_ms. Non-streaming requests have a 9-minute default deadline; streaming requests have no deadline (opt-in only). If the deadline is exceeded, the request fails with a timeout error.
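The interaction between per-attempt timeouts, the fallback ceiling, and the overall deadline can be sketched as a loop. This is a simplified model under stated assumptions, not the router's implementation: `call_with_fallbacks`, `AttemptTimeout`, and the `send` callback are hypothetical, capping each attempt's timeout at the remaining deadline is an assumption, and the defaults shown are the non-streaming values from the table above.

```python
import time


class AttemptTimeout(Exception):
    """Raised by `send` when a single attempt exceeds its timeout."""


def is_retryable(status):
    # 429 (rate limit) and 5xx (server error) trigger a fallback.
    return status == 429 or 500 <= status <= 599


def call_with_fallbacks(providers, send, allow_fallbacks=True,
                        max_fallback_attempts=19, timeout_ms=180000,
                        deadline_ms=540000, clock=time.monotonic):
    """Try providers in ranked order until one returns a non-retryable status.

    `send(provider, timeout_ms)` is a caller-supplied function returning
    (status, body) or raising AttemptTimeout.
    """
    start = clock()
    # The primary attempt plus up to max_fallback_attempts fallbacks.
    limit = 1 + (max_fallback_attempts if allow_fallbacks else 0)
    last_error = None
    for provider in providers[:limit]:
        attempt_timeout = timeout_ms
        if deadline_ms is not None:
            remaining_ms = deadline_ms - (clock() - start) * 1000
            if remaining_ms <= 0:
                raise TimeoutError("deadline_ms exceeded")
            # Assumption: an attempt never outlives the overall deadline.
            attempt_timeout = min(timeout_ms, remaining_ms)
        try:
            status, body = send(provider, attempt_timeout)
        except AttemptTimeout:
            last_error = "timeout"
            continue
        if not is_retryable(status):
            return provider, status, body
        last_error = status
    raise RuntimeError(f"all attempts failed; last error: {last_error}")
```

Passing `deadline_ms=None` mirrors the streaming default, where only the per-attempt timeout applies. The `clock` parameter exists so the deadline logic can be exercised with a fake clock in tests.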