API Overview - Auriko

The Auriko API is OpenAI-compatible, meaning you can use it as a drop-in replacement for OpenAI’s API.

Base URL

https://api.auriko.ai/v1

Endpoints

Inference API

Endpoint	Method	Description
`/v1/chat/completions`	POST	Create a chat completion
`/v1/me`	GET	Get API key identity

Discovery API

Endpoint	Method	Description
`/v1/registry/providers`	GET	List available providers
`/v1/registry/models`	GET	List canonical models
`/v1/directory/models`	GET	Full model directory with pricing

Management API

Resource	Endpoints	Description
Workspaces	4	Create, list, get, update workspaces
Routing	2	Get and update workspace routing defaults
Budgets	5	CRUD for spending limits
API Keys	4	Create, list, revoke keys; usage stats
Billing	1	Credit balance
Provider Keys	7	BYOK key management

Management API endpoints use session authentication. Workspace and budget reads also accept API keys. See Authentication.

OpenAI Compatibility

Auriko supports the same request/response format as OpenAI. Existing code using OpenAI’s API can switch to Auriko by changing:

Base URL: https://api.openai.com/v1 → https://api.auriko.ai/v1
API Key: Use your Auriko API key (starts with ak_)

Auriko Extensions

In addition to OpenAI-compatible fields, Auriko responses carry:

routing_metadata - Information about which provider handled the request
routing (request) - Options to optimize for cost, speed, or throughput
auriko_metadata (request) - Custom tags and trace IDs for request tracking

Response Extensions

Every chat completion response carries a routing_metadata object:

{
  "routing_metadata": {
    "provider": "openai",
    "provider_model_id": "gpt-5.4",
    "tier": "standard",
    "model_canonical": "gpt-5.4",
    "routing_strategy": "balanced",
    "total_latency_ms": 1234,
    "ttft_ms": 312,
    "candidates_total": 5,
    "candidates_viable": 3,
    "routing_decision_ms": 2.1,
    "cost": {
      "input_tokens": 100,
      "output_tokens": 50,
      "provider_cost_usd": 0.000135,
      "billable_cost_usd": 0.00015
    }
  }
}

Field reference

Field	Type	Always present	Description
`provider`	string	yes	Provider that served the request (e.g., `openai`, `anthropic`)
`provider_model_id`	string	yes	Provider’s internal model ID
`tier`	string	no	Pricing tier when applicable (e.g., `flex`, `standard`). Omitted for providers without tiers.
`model_canonical`	string	yes	Canonical model ID from the request
`routing_strategy`	string	yes	Strategy used: `cost`, `speed`, `balanced`, `ttft`, `throughput`, `cheapest`, or `custom`. `custom` is returned when explicit `routing.weights` are provided.
`candidates_total`	number	yes	Total provider candidates before filtering
`candidates_viable`	number	yes	Candidates remaining after constraint filtering
`routing_decision_ms`	number	yes	Time spent on the routing decision (ms)
`ttft_ms`	number	no	Time to first token (ms). Streaming responses only.
`total_latency_ms`	number	yes	Total request latency (ms)
`cost`	object	no	Cost breakdown. Present when token usage is available and pricing is token-based.
`fallback_chain`	array	no	Fallback attempt history. Present only when the primary provider failed and a fallback was used.
`warnings`	string[]	no	Warnings about ignored or unsupported routing configuration. Omitted when empty.

cost fields: input_tokens (number), output_tokens (number), provider_cost_usd (number), billable_cost_usd (number). fallback_chain entries: provider (string), status ("success" or "failed"), reason (string, present on failed entries only).

Request Extensions

Pass routing options to optimize your requests:

{
  "model": "gpt-5.4",
  "messages": [...],
  "routing": {
    "optimize": "cost",
    "max_ttft_ms": 200
  }
}

Request metadata

Attach custom metadata to requests for tracking and observability. Auriko strips auriko_metadata before forwarding to the provider. The auriko_ prefix avoids collision with OpenAI/Anthropic native metadata fields.

{
  "model": "gpt-5.4",
  "messages": [...],
  "auriko_metadata": {
    "tags": ["production", "chatbot"],
    "user_id": "user_abc123",
    "trace_id": "trace_xyz789",
    "custom_fields": {
      "environment": "production",
      "feature": "customer-support"
    }
  }
}

Field	Type	Limits
`tags`	`string[]`	Max 100 tags, each max 50 chars
`user_id`	`string`	Max 255 chars
`trace_id`	`string`	Max 255 chars
`custom_fields`	`Record<string, string>`	Max 10 fields, keys max 50 chars, values max 200 chars

Response headers

Every response carries custom headers organized by category.

Request tracing

Header	Description
`X-Request-ID`	Unique request identifier for support and debugging

Routing headers

Header	Description
`X-Provider-Used`	Provider that served the request
`X-Model-Requested`	Model ID from the original request
`X-Model-Canonical`	Canonical model ID after alias resolution
`X-Model-Used`	Actual model ID sent to the provider
`X-Routing-Strategy`	Strategy used (`cost`, `speed`, `balanced`, etc.)
`X-Routing-Time-Ms`	Time spent on routing decision
`X-Api-Key-Source`	Key type used (`platform` or `byok`)
`X-Multi-Model-Count`	Number of models in a multi-model request

Fallback headers

Header	Description
`X-Fallback-Enabled`	Whether fallback is enabled for this request
`X-Fallback-Used`	Whether a fallback was triggered
`X-Fallback-Depth`	Number of fallback attempts made
`X-Fallback-Original-Provider`	Provider of the first (failed) attempt
`X-Fallback-Attempted-Providers`	Comma-separated list of all attempted providers
`X-Fallback-Reason`	Reason the primary provider failed
`X-Fallback-Total-Time-Ms`	Total time across all attempts
`X-Fallback-Max-Attempts`	Maximum fallback attempts configured

Error diagnostic headers

Header	Description
`X-Error-Provider`	Provider that returned the error
`X-Error-Type`	Error classification
`X-Error-Retryable`	Whether the error is safe to retry

Billing headers

Header	Description
`X-Credits-Balance-Microdollars`	Current credit balance in microdollars
`X-Credits-Tier`	Current billing tier

Per-key rate limit headers are covered in Rate limits.

Budget headers

Returned when budgets are configured. Error headers appear on 402 responses when a budget is exceeded. Spend and limit headers appear on successful responses.

Header	Description
`X-Budget-Exceeded`	`true` when the request was rejected for exceeding a budget (402 only)
`X-Budget-Exceeded-Period`	Budget period that was exceeded: `daily`, `weekly`, or `monthly` (402 only)
`X-Budget-Exceeded-Scope`	Scope of the exceeded budget: `workspace`, `api_key`, or `byok_provider` (402 only)
`X-Budget-Daily-Spend`	Current daily spend in USD
`X-Budget-Daily-Limit`	Configured daily budget limit in USD
`X-Budget-Weekly-Spend`	Current weekly spend in USD
`X-Budget-Weekly-Limit`	Configured weekly budget limit in USD
`X-Budget-Monthly-Spend`	Current monthly spend in USD
`X-Budget-Monthly-Limit`	Configured monthly budget limit in USD

Spend and limit headers appear only for budget periods that are configured. If a workspace has only a monthly budget, daily and weekly headers are omitted.

Cache headers

Header	Description
`X-Cache-Savings-Percent`	Percentage of tokens saved via prompt caching

Authentication

Learn how to authenticate your requests

API Reference

​Base URL

​Endpoints

​Inference API

​Discovery API

​Management API

​OpenAI Compatibility

​Auriko Extensions

​Response Extensions

​Field reference

​Request Extensions

​Request metadata

​Response headers

​Request tracing

​Routing headers

​Fallback headers

​Error diagnostic headers

​Billing headers

​Budget headers

​Cache headers

Authentication

Base URL

Endpoints

Inference API

Discovery API

Management API

OpenAI Compatibility

Auriko Extensions

Response Extensions

Field reference

Request Extensions

Request metadata

Response headers

Request tracing

Routing headers

Fallback headers

Error diagnostic headers

Billing headers

Budget headers

Cache headers