Python SDK Reference
See the Python SDK Guide for getting started and usage examples.
Client
Initialize a client with configuration options:
from auriko import Client, AsyncClient
client = Client(
    api_key="sk_ia_...",                  # or AURIKO_API_KEY env var
    base_url="https://api.auriko.ai/v1",  # default
    timeout=60.0,                         # seconds, default 60
    max_retries=2,                        # default 2 (0 disables)
)
Resources
| Resource | Methods |
|---|---|
| client.chat.completions | create(...) |
| client.models | list_directory(), list_registry(), list_providers() |
| client.workspaces | list(), get(workspace_id) |
| client.budgets | list(workspace_id), get(workspace_id, budget_id) |
| client.me | get() |
All resources are available on both Client (sync) and AsyncClient (async).
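A minimal async sketch, assuming AsyncClient mirrors Client's configuration (including the AURIKO_API_KEY environment variable) and that resource methods are awaitable:

import asyncio

from auriko import AsyncClient

async def main():
    client = AsyncClient()  # assumed to read AURIKO_API_KEY from the environment
    response = await client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Hello!"}],
    )
    print(response.model)

asyncio.run(main())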
Chat Completions
client.chat.completions.create(...)
Creates a chat completion. Supports single-model and multi-model routing.
# Non-streaming
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
    max_tokens=100,
)

# Streaming
stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True,
)
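A hedged sketch of multi-model routing via the models parameter; the model IDs shown are illustrative, and routing_metadata on the response reports what was actually used:

# Multi-model routing: candidate model IDs (illustrative)
response = client.chat.completions.create(
    models=["gpt-4o", "claude-sonnet-4"],
    messages=[{"role": "user", "content": "Hello!"}],
    auriko_metadata={"feature": "docs-example"},  # logged, visible in dashboard
)
response.routing_metadata  # which model/provider actually handled the request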
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| messages | list[dict] | Yes | Conversation messages (non-empty) |
| model | str | One of model/models | Model ID |
| models | list[str] | One of model/models | Model IDs for multi-model routing |
| stream | bool | No | Enable streaming (default: False) |
| temperature | float | No | Sampling temperature (0–2) |
| max_tokens | int | No | Max tokens to generate |
| max_completion_tokens | int | No | Max completion tokens (alias for max_tokens) |
| top_p | float | No | Nucleus sampling (0–1) |
| frequency_penalty | float | No | Frequency penalty (-2 to 2) |
| presence_penalty | float | No | Presence penalty (-2 to 2) |
| stop | str \| list[str] | No | Stop sequences |
| seed | int | No | Deterministic sampling seed |
| n | int | No | Number of completions to generate |
| tools | list[dict] | No | Function calling tool definitions |
| tool_choice | str \| dict | No | Tool selection: "auto", "none", "required", or a function spec |
| parallel_tool_calls | bool | No | Allow parallel function calls |
| response_format | dict | No | Output format (e.g., {"type": "json_object"}) |
| stream_options | dict | No | Stream options (e.g., {"include_usage": True}) |
| logprobs | bool | No | Return log probabilities |
| top_logprobs | int | No | Number of top logprobs per token (0–20) |
| logit_bias | dict[str, float] | No | Token bias adjustments |
| user | str | No | End-user identifier |
| routing | RoutingOptions \| dict | No | Routing configuration |
| extensions | Extensions \| dict | No | Provider-specific extensions (thinking, passthrough) |
| auriko_metadata | dict | No | Request metadata (logged, visible in dashboard) |
| extra_body | dict | No | Additional body fields (merged last, except stream) |
Response (non-streaming)
class ChatCompletion:
    id: str
    created: int
    model: str
    object: str  # "chat.completion"
    system_fingerprint: Optional[str]
    choices: list[Choice]
    usage: Optional[Usage]
    routing_metadata: Optional[RoutingMetadata]
    response_headers: Optional[ResponseHeaders]

class ChoiceMessage:
    role: str
    content: Optional[str]
    reasoning_content: Optional[str]  # populated by Anthropic, DeepSeek, Google, Fireworks AI
    tool_calls: Optional[list[ToolCall]]
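Reading a non-streaming response, continuing the example above (this sketch assumes each Choice exposes its ChoiceMessage as message):

choice = response.choices[0]
print(choice.message.content)            # assistant text
print(choice.message.reasoning_content)  # None unless the provider returns reasoning
print(response.usage)                    # token accounting, if provided
print(response.routing_metadata)         # routing details, if provided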
Response (streaming)
Returns a Stream that yields ChatCompletionChunk objects.
stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True,
)

for chunk in stream:
    chunk.choices[0].delta.content            # incremental content
    chunk.choices[0].delta.reasoning_content  # incremental reasoning (if enabled)

stream.usage             # available after iteration
stream.routing_metadata  # available after iteration
stream.response_headers  # available immediately
stream.close()           # manual cleanup (or use context manager)
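A sketch of the context-manager form mentioned above, which closes the stream automatically (it assumes the object returned by create(..., stream=True) supports the with statement):

with client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True,
) as stream:
    for chunk in stream:
        delta = chunk.choices[0].delta if chunk.choices else None
        if delta and delta.content:
            print(delta.content, end="")

print(stream.usage)  # populated once the stream has been fully consumed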
Models
Query the model catalog:
directory = client.models.list_directory() # GET /v1/directory/models
registry = client.models.list_registry() # GET /v1/registry/models
providers = client.models.list_providers() # GET /v1/registry/providers
Workspaces
List and retrieve workspaces:
workspaces = client.workspaces.list() # GET /v1/workspaces
workspace = client.workspaces.get("ws-123") # GET /v1/workspaces/{id}
Budgets
List and retrieve budgets:
budgets = client.budgets.list("ws-123") # GET /v1/workspaces/{id}/budgets
budget = client.budgets.get("ws-123", "budget-456") # GET /v1/workspaces/{id}/budgets/{bid}
Identity
Get current API key identity:
identity = client.me.get() # GET /v1/me
# Returns: ApiKeyIdentity { object, user_id, workspace_id, tier, rate_limit_rpm }
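For example, assuming attribute access on the returned ApiKeyIdentity:

identity = client.me.get()
print(identity.workspace_id, identity.tier, identity.rate_limit_rpm)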
Error Classes
All errors extend AurikoAPIError:
| Error Class | HTTP Status | Error Code |
|---|---|---|
| AuthenticationError | 401 | invalid_api_key |
| RateLimitError | 429 | rate_limit_exceeded |
| InsufficientCreditsError | 402 | insufficient_quota |
| BudgetExceededError | 402 | budget_exceeded |
| ModelNotFoundError | 404 | model_not_found |
| InvalidRequestError | 400 | invalid_request |
| ProviderError | 502 | provider_error |
| ProviderAuthError | 401 | provider_auth_error |
| ServiceUnavailableError | 503 | service_unavailable |
| InternalError | 500 | internal_error |
AurikoAPIError Fields
| Field | Type | Description |
|---|---|---|
| message | str | Human-readable error description |
| status_code | int | HTTP status code |
| code | str | Machine-readable error code |
| type | Optional[str] | Error type category |
| param | Optional[str] | Parameter that caused the error |
| body | Any | Raw response body |
| response_headers | Optional[ResponseHeaders] | Response headers from the failed request |
from auriko import RateLimitError, AuthenticationError
try:
    client.chat.completions.create(...)
except RateLimitError as e:
    print(e.response_headers.rate_limit_reset)
except AuthenticationError as e:
    print(e.message)
Providers may return additional error codes beyond those listed above. Always handle the base AurikoAPIError as a catch-all.
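A sketch of that catch-all pattern:

from auriko import AurikoAPIError

try:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Hello!"}],
    )
except AurikoAPIError as e:
    # Covers every class in the table above, plus any provider-specific codes
    print(e.status_code, e.code, e.message)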
Response Headers
Available on ChatCompletion.response_headers and Stream.response_headers:
response.response_headers.request_id # X-Request-ID
response.response_headers.rate_limit_remaining # X-RateLimit-Remaining-Requests
response.response_headers.rate_limit_limit # X-RateLimit-Limit-Requests
response.response_headers.rate_limit_reset # X-RateLimit-Reset-Requests
response.response_headers.credits_balance_microdollars # X-Credits-Balance-Microdollars
response.response_headers.provider_used # X-Provider-Used
response.response_headers.routing_strategy # X-Routing-Strategy
response.response_headers.get("x-custom-header") # any header by name
Types
Client & Stream
from auriko import Client, AsyncClient
from auriko._streaming import Stream, AsyncStream # internal module; type-annotation use only
Chat Response Types
from auriko.models.chat import (
    ChatCompletion, ChatCompletionChunk, Choice, ChoiceMessage,
    StreamChoice, Delta, ToolCall, ToolCallFunction,
    ToolCallDelta, ToolCallDeltaFunction,
)
Common Types
from auriko.models.common import Usage, PromptTokensDetails, CompletionTokensDetails, ApiKeyIdentity
Routing Types
from auriko.route_types import (
    RoutingOptions, RoutingMetadata, CostInfo, FallbackChainEntry,
    Optimize, Mode, DataPolicy,
)
Extensions
from auriko.models.extensions import Extensions, ThinkingConfig
| Field | Type | Description |
|---|---|---|
| thinking | ThinkingConfig | Extended thinking configuration (enabled, budget_tokens) |
| anthropic | dict | Anthropic-specific parameters |
| openai | dict | OpenAI-specific parameters |
| google | dict | Google-specific parameters |
| deepseek | dict | DeepSeek-specific parameters |
| [key] | dict | Arbitrary provider passthrough |
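A hedged sketch of passing extensions as a dict; the model ID and provider-specific values are illustrative, and the thinking keys follow the ThinkingConfig fields listed above:

response = client.chat.completions.create(
    model="claude-sonnet-4",  # illustrative model ID
    messages=[{"role": "user", "content": "Explain quicksort."}],
    extensions={
        "thinking": {"enabled": True, "budget_tokens": 2048},  # ThinkingConfig fields
        "anthropic": {"top_k": 40},  # provider passthrough (illustrative)
    },
)
print(response.choices[0].message.reasoning_content)  # may be populated when thinking is enabled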
Model Discovery Types
from auriko.models.providers import (
    ModelsListResponse, CanonicalModel,
    DirectoryResponse, DirectoryModel, ProviderEntry, TierEntry,
    ProviderList, ProviderInfo,
)
Workspace & Budget Types
from auriko.models.workspaces import Workspace, WorkspaceList
from auriko.models.budgets import Budget, BudgetList, Period, ScopeType
Error Classes
from auriko import (
    AurikoAPIError, AuthenticationError, RateLimitError,
    InsufficientCreditsError, BudgetExceededError, ModelNotFoundError,
    InvalidRequestError, ProviderError, ProviderAuthError,
    ServiceUnavailableError, InternalError,
)
Utilities
from auriko import ResponseHeaders, map_openai_error
from auriko.route_types import parse_routing_metadata
parse_routing_metadata(response) extracts RoutingMetadata from an OpenAI SDK response’s model_extra. Returns None if absent or unparseable.
This utility is Python-only. The Auriko SDK’s ChatCompletion type exposes routing_metadata as a typed property directly.
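A sketch of that interop path, pointing the OpenAI SDK at the Auriko base URL shown earlier (key handling is an assumption):

from openai import OpenAI

from auriko.route_types import parse_routing_metadata

openai_client = OpenAI(
    base_url="https://api.auriko.ai/v1",
    api_key="sk_ia_...",  # assumed: an Auriko API key works as the bearer token here
)
response = openai_client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
)
meta = parse_routing_metadata(response)  # reads the response's model_extra
if meta is not None:
    print(meta)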