TypeScript SDK Reference
See the TypeScript SDK Guide for usage examples and getting started.
Client
Initialize a client with configuration options:
import { Client } from "@auriko/sdk";
const client = new Client({
  apiKey: "ak_...", // or AURIKO_API_KEY env var
  baseUrl: "https://api.auriko.ai/v1", // default
  timeout: 60_000, // ms, default 60s
  maxRetries: 2, // default 2 (0 disables)
});
Resources
| Resource | Methods |
|---|---|
| client.chat.completions | create(params) |
| client.responses | create(params) |
| client.models | list(), retrieve(modelId), listDirectory(), listRegistry(), listProviders() |
| client.me | get() |
Chat Completions
client.chat.completions.create(params)
Creates a chat completion. Supports single-model and multi-model routing.
// Non-streaming
const response = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Hello!" }],
  max_tokens: 100,
});
// Streaming
const stream = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Hello!" }],
  stream: true,
});
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| messages | Array<Message> | Yes | Conversation messages (non-empty) |
| model | string | One of model/gateway.models | Model ID |
| stream | boolean | No | Enable streaming (default: false) |
| temperature | number | No | Sampling temperature (0–2) |
| max_tokens | number | No | Max tokens to generate |
| max_completion_tokens | number | No | Max completion tokens (alias for max_tokens) |
| reasoning_effort | 'low' \| 'medium' \| 'high' \| 'xhigh' \| 'max' \| 'off' | No | Reasoning effort for supported models; translated to the provider-native control (see guide) |
| top_p | number | No | Nucleus sampling (0–1) |
| frequency_penalty | number | No | Frequency penalty (-2 to 2) |
| presence_penalty | number | No | Presence penalty (-2 to 2) |
| top_k | number | No | Top-K sampling |
| min_p | number | No | Min-P sampling (0–1) |
| top_a | number | No | Top-A sampling (0–1) |
| repetition_penalty | number | No | Repetition penalty |
| stop | string \| string[] | No | Stop sequences |
| seed | number | No | Deterministic sampling seed |
| n | number | No | Number of completions to generate |
| tools | Tool[] | No | Function calling tool definitions |
| tool_choice | string \| object | No | Tool selection: "auto", "none", "required", or function spec |
| parallel_tool_calls | boolean | No | Allow parallel function calls |
| response_format | object | No | Output format (e.g., { type: "json_object" }) |
| stream_options | object | No | Stream options (e.g., { include_usage: true }) |
| logprobs | boolean | No | Return log probabilities |
| top_logprobs | number | No | Number of top logprobs per token (0–20) |
| logit_bias | Record<string, number> | No | Token bias adjustments |
| user | string | No | End-user identifier |
| gateway | GatewayOptions \| Record<string, unknown> | No | Gateway directives: routing, metadata, models |
| extensions | Extensions \| Record<string, unknown> | No | Provider-specific extensions (provider passthrough) |
| extra_body | Record<string, unknown> | No | Additional body fields (merged last except stream; gateway-aware one-level-deep merge on gateway) |
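The extra_body merge rule is compact, so a small sketch may help. The helper below is hypothetical (mergeExtraBody is not an SDK export) and encodes one possible reading of the description: extra_body keys override the request body, stream is never overridden, and the top-level keys of gateway are merged rather than replaced. The SDK's actual merge may differ in detail.

```typescript
type Body = Record<string, unknown>;

function isPlainObject(value: unknown): value is Body {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Hypothetical illustration of the documented merge order; not the SDK's code.
function mergeExtraBody(params: Body, extraBody: Body): Body {
  const merged: Body = { ...params };
  for (const [key, value] of Object.entries(extraBody)) {
    // Assumption: "except stream" means extra_body cannot change stream.
    if (key === "stream") continue;
    const existing = merged[key];
    if (key === "gateway" && isPlainObject(existing) && isPlainObject(value)) {
      // Assumption: "one-level-deep" means gateway's own keys are merged.
      merged[key] = { ...existing, ...value };
      continue;
    }
    merged[key] = value; // all other keys: extra_body wins
  }
  return merged;
}
```

Under this reading, passing gateway.models through extra_body would keep a gateway.metadata object that was set in the main params.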
gateway.metadata accepts the following fields:
| Field | Type | Description |
|---|---|---|
| tags | string[] | Tags for categorizing requests (max 100 items, each ≤50 chars) |
| user_id | string | Your application’s user identifier for per-user analytics (max 255 chars) |
| trace_id | string | Distributed tracing identifier (max 255 chars) |
| custom_fields | Record<string, string> | Arbitrary key-value pairs (max 10 keys, keys ≤50 chars, values ≤200 chars) |
import { Client } from "@auriko/sdk";
const client = new Client();
const response = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Hello!" }],
  gateway: {
    metadata: {
      user_id: "user_123",
      trace_id: "req-abc",
      custom_fields: { env: "prod", team: "backend" },
    },
  },
});
Only the four fields above are accepted. Use custom_fields for arbitrary key-value pairs.
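The limits in the table can be checked client-side before sending a request. The following is a minimal sketch: validateMetadata and RequestMetadata are hypothetical names, not SDK exports, and the length checks count UTF-16 code units, which may not match how the server counts characters.

```typescript
interface RequestMetadata {
  tags?: string[];
  user_id?: string;
  trace_id?: string;
  custom_fields?: Record<string, string>;
}

// Hypothetical pre-flight check mirroring the documented limits.
// Returns a list of problems; an empty list means the metadata fits.
function validateMetadata(meta: RequestMetadata): string[] {
  const problems: string[] = [];
  const tags = meta.tags ?? [];
  if (tags.length > 100) problems.push("tags: more than 100 items");
  if (tags.some((tag) => tag.length > 50)) problems.push("tags: item longer than 50 chars");
  if ((meta.user_id ?? "").length > 255) problems.push("user_id: longer than 255 chars");
  if ((meta.trace_id ?? "").length > 255) problems.push("trace_id: longer than 255 chars");

  const entries = Object.entries(meta.custom_fields ?? {});
  if (entries.length > 10) problems.push("custom_fields: more than 10 keys");
  for (const [key, value] of entries) {
    if (key.length > 50) problems.push(`custom_fields.${key}: key longer than 50 chars`);
    if (value.length > 200) problems.push(`custom_fields.${key}: value longer than 200 chars`);
  }
  return problems;
}
```

Returning a list rather than throwing on the first problem lets a caller log every violation at once.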
Response (non-streaming)
interface ChatCompletion {
  id: string;
  created: number;
  model: string;
  object: "chat.completion";
  system_fingerprint?: string; // not all models include this
  choices: Choice[];
  usage?: Usage;
  routing_metadata?: RoutingMetadata;
  service_tier?: string | null; // processing tier (OpenAI-routed models)
  responseHeaders: ResponseHeaders; // SDK-added
}
interface ChoiceMessage {
  role: string;
  content: string | null;
  reasoning_content?: string; // chain-of-thought text (plain string)
  reasoning?: ReasoningBlock[]; // structured reasoning blocks with signatures
  refusal?: string | null; // model refusal content (OpenAI passthrough)
  tool_calls?: ToolCall[];
  annotations?: unknown[]; // URL citations and model annotations (OpenAI-routed models)
}
Response (streaming)
Returns a Stream that yields ChatCompletionChunk objects.
const stream = await client.chat.completions.create({ stream: true, ... });
for await (const chunk of stream) {
  chunk.choices[0]?.delta?.content; // incremental content
  chunk.choices[0]?.delta?.reasoning_content; // incremental reasoning text (if enabled)
  chunk.choices[0]?.delta?.reasoning_signature; // signature for current thinking block
  chunk.choices[0]?.delta?.reasoning_redacted_data; // encrypted redacted thinking data
}
stream.usage; // available after iteration
stream.routing_metadata; // available after iteration
stream.responseHeaders; // available immediately
stream.isClosed; // boolean
stream.close(); // manual cleanup
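Because each chunk carries only an incremental delta, callers usually concatenate the pieces themselves. The sketch below shows one way to do that. It is written against a structural ChunkLike type and a mock generator (both hypothetical) so it runs without the SDK; only the delta.content and delta.reasoning_content fields shown above are assumed.

```typescript
// Structural stand-in for ChatCompletionChunk; only the fields used here.
interface ChunkLike {
  choices: Array<{
    delta?: { content?: string | null; reasoning_content?: string | null };
  }>;
}

// Concatenates incremental deltas from any async iterable of chunks.
async function collectText(
  stream: AsyncIterable<ChunkLike>,
): Promise<{ content: string; reasoning: string }> {
  let content = "";
  let reasoning = "";
  for await (const chunk of stream) {
    const delta = chunk.choices[0]?.delta; // choices may be empty on some chunks
    content += delta?.content ?? "";
    reasoning += delta?.reasoning_content ?? "";
  }
  return { content, reasoning };
}

// Mock stream for illustration; a real Stream would come from create({ stream: true }).
async function* mockChatStream(): AsyncGenerator<ChunkLike> {
  yield { choices: [{ delta: { reasoning_content: "thinking" } }] };
  yield { choices: [{ delta: { content: "Hel" } }] };
  yield { choices: [{ delta: { content: "lo!" } }] };
  yield { choices: [] };
}
```

With the mock above, collectText(mockChatStream()) resolves to content "Hello!" and reasoning "thinking".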
Responses
client.responses.create(params)
Creates a response using the OpenAI Responses API format. Supports single-model and multi-model routing.
// Non-streaming
const response = await client.responses.create({
  model: "gpt-4o",
  input: "Hello!",
});
// Streaming
const stream = await client.responses.create({
  model: "gpt-4o",
  input: "Hello!",
  stream: true,
});
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| input | string \| ResponseInputItemParam[] | Yes | Text string or structured input items |
| model | string | Yes, unless gateway.models is set | Model ID (use gateway.models for multi-model routing) |
| stream | boolean | No | Enable streaming (default: false) |
| instructions | string | No | System instructions for the model |
| tools | ResponseToolParam[] | No | Tool definitions |
| tool_choice | string \| Record<string, unknown> | No | Tool selection: "auto", "none", "required", or function spec |
| parallel_tool_calls | boolean | No | Allow parallel function calls |
| max_output_tokens | number | No | Max tokens to generate |
| temperature | number | No | Sampling temperature (0–2) |
| top_p | number | No | Nucleus sampling (0–1) |
| top_k | number | No | Top-K sampling |
| top_logprobs | number | No | Number of top logprobs per token (0–20) |
| reasoning | ResponseReasoningParam | No | Reasoning config: effort, summary, generate_summary |
| text | Record<string, unknown> | No | Text format config (e.g., { format: { type: "json_schema", ... } }) |
| user | string | No | End-user identifier |
| metadata | Record<string, string> | No | Arbitrary key-value metadata |
| include | string[] | No | Additional data to include in the response |
| truncation | string | No | Truncation strategy for long inputs |
| prompt_cache_key | string | No | Key for prompt caching |
| safety_identifier | string | No | Safety policy identifier |
| gateway | GatewayOptions \| Record<string, unknown> | No | Gateway namespace for routing, multi-model, and metadata options |
| extensions | Extensions \| Record<string, unknown> | No | Provider-specific extensions |
| extra_body | Record<string, unknown> | No | Additional body fields (merged last) |
Response (non-streaming)
interface ResponseObject {
  id: string;
  object: "response";
  created_at: number;
  model: string;
  status: "completed" | "failed" | "incomplete" | "in_progress";
  output: ResponseOutputItem[];
  output_text: string;
  parallel_tool_calls: boolean;
  tool_choice: unknown;
  tools: unknown[];
  usage?: ResponseUsage | null;
  error?: ResponseError | null;
  incomplete_details?: Record<string, string> | null;
  metadata?: Record<string, string> | null;
  routing_metadata?: RoutingMetadata | null;
  responseHeaders: ResponseHeaders;
}
Response (streaming)
Returns a ResponseStream that yields Responses API events.
const stream = await client.responses.create({
  model: "gpt-4o",
  input: "Hello!",
  stream: true,
});
for await (const event of stream) {
  if (event.type === "response.output_text.delta") {
    process.stdout.write(event.delta);
  }
}
// After iteration, the terminal event's response is available:
const final = stream.completedResponse; // ResponseObject from the terminal event
final?.usage; // token usage
final?.routing_metadata; // routing details
stream.responseHeaders; // available immediately (before iteration)
stream.close(); // manual cleanup
routing_metadata is available on non-streaming responses and on stream.completedResponse. When streaming, it is populated only after iteration completes.
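When a consumer stops reading before the stream ends, pairing the loop with close() in a finally block is one way to make sure cleanup always runs. The sketch below illustrates that pattern with a mock stream so it runs without the SDK. ClosableStream, MockResponseStream, and readUpTo are hypothetical names, and whether an explicit close() is required after an early break in the real SDK is not stated here, so treat it as a defensive habit rather than documented behavior.

```typescript
interface TextEventLike {
  type: string;
  delta?: string;
}

interface ClosableStream extends AsyncIterable<TextEventLike> {
  close(): void;
}

// Reads text deltas until maxChars is reached, then stops and closes the stream.
async function readUpTo(stream: ClosableStream, maxChars: number): Promise<string> {
  let text = "";
  try {
    for await (const event of stream) {
      if (event.type === "response.output_text.delta" && typeof event.delta === "string") {
        text += event.delta;
        if (text.length >= maxChars) break; // early exit
      }
    }
  } finally {
    stream.close(); // runs on normal completion, early break, and errors
  }
  return text.slice(0, maxChars);
}

// Mock stream for illustration only.
class MockResponseStream implements ClosableStream {
  closed = false;
  private readonly deltas: string[];
  constructor(deltas: string[]) {
    this.deltas = deltas;
  }
  close(): void {
    this.closed = true;
  }
  async *[Symbol.asyncIterator](): AsyncGenerator<TextEventLike> {
    for (const delta of this.deltas) {
      yield { type: "response.output_text.delta", delta };
    }
  }
}
```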
Models
Query the model catalog:
const models = await client.models.list(); // GET /v1/models
const model = await client.models.retrieve("gpt-4o"); // GET /v1/models/{model_id}
const directory = await client.models.listDirectory(); // GET /v1/directory/models
const registry = await client.models.listRegistry(); // GET /v1/registry/models
const providers = await client.models.listProviders(); // GET /v1/registry/providers
Identity
Get current API key identity:
const identity = await client.me.get(); // GET /v1/me
// Returns: { object, user_id, workspace_id, tier, rate_limit_rpm }
Error Classes
All errors extend AurikoAPIError. Dispatch is driven by the type field of the canonical error envelope (see Errors for the full envelope and retry policy).
| Error Class | HTTP | type |
|---|---|---|
| BadRequestError | 400 / 413 / 422 | invalid_request_error |
| AuthenticationError | 401 | authentication_error |
| PermissionDeniedError | 403 | permission_error |
| NotFoundError | 404 | not_found_error |
| ConflictError | 409 | invalid_request_error |
| RateLimitError | 429 | rate_limit_error |
| InternalServerError | 500 | api_error |
| APIStatusError | 502 / 503 / 504 | api_error |
| APIConnectionError | n/a | (network failure before response) |
AurikoAPIError Fields
| Field | Type | Description |
|---|---|---|
| message | string | Human-readable error description (inherited from Error) |
| statusCode | number | HTTP status code |
| code | string | Machine-readable error code (see Error Codes) |
| type | string | Canonical error type (one of six values) |
| param | string \| null | Parameter that caused the error, when attributable |
| requestId | string | Value of x-request-id on the failing response |
| docUrl | string \| undefined | Link to the error’s docs page |
| retryAfterSeconds | number \| undefined | Retry-After header value (429 / 503 only) |
| provider | string \| undefined | Upstream provider that produced this error, when attributable |
import { Client, RateLimitError, AuthenticationError } from "@auriko/sdk";
try {
  await client.chat.completions.create({ ... });
} catch (e) {
  if (e instanceof RateLimitError) {
    console.log(`retry after ${e.retryAfterSeconds}s (requestId=${e.requestId})`);
  } else if (e instanceof AuthenticationError) {
    console.log(`${e.message} (requestId=${e.requestId})`);
  } else {
    throw e; // catch-all: rethrow anything not handled above
  }
}
Unknown error responses fall through to the base AurikoAPIError class. Always keep a catch-all for forward compatibility.
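For application-level retries on top of the client's built-in maxRetries, the retryAfterSeconds and statusCode fields are enough to pick a delay. The following is a minimal sketch of one possible policy, not the SDK's own retry logic: retryDelayMs, the 500 ms base, and the 30 s cap are all hypothetical choices. It reads the fields structurally, so it also acts as a catch-all for unknown error classes.

```typescript
// Structural view of the error fields used below.
interface RetryableErrorLike {
  retryAfterSeconds?: number;
  statusCode?: number;
}

// Returns a delay in milliseconds, or null when the error should not be retried.
// attempt is zero-based: 0 for the first retry, 1 for the second, and so on.
function retryDelayMs(
  error: unknown,
  attempt: number,
  baseMs = 500,
  capMs = 30_000,
): number | null {
  if (typeof error !== "object" || error === null) return null;
  const { retryAfterSeconds, statusCode } = error as RetryableErrorLike;

  // Prefer the server's hint when one is present.
  if (typeof retryAfterSeconds === "number" && retryAfterSeconds >= 0) {
    return Math.min(retryAfterSeconds * 1000, capMs);
  }
  // Otherwise back off exponentially on 429 and 5xx (an assumed policy).
  if (typeof statusCode === "number" && (statusCode === 429 || statusCode >= 500)) {
    return Math.min(baseMs * 2 ** attempt, capMs);
  }
  return null; // e.g. 400 or 401: retrying will not help
}
```

A caller would sleep for the returned delay and retry, or rethrow when the result is null.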
mapErrorFromCode(code, message, responseHeaders, opts?) constructs a typed AurikoAPIError subclass from an error code string (e.g., "rate_limit_error" → RateLimitError):
import { mapErrorFromCode, RateLimitError } from "@auriko/sdk";
const err = mapErrorFromCode("rate_limit_error", "Too many requests", responseHeaders);
if (err instanceof RateLimitError) {
  console.log(err.retryAfterSeconds);
}
Response headers are available on ChatCompletion.responseHeaders, Stream.responseHeaders, ResponseStream.responseHeaders, and ResponseObject.responseHeaders:
response.responseHeaders.requestId; // X-Request-ID
response.responseHeaders.rateLimitRemaining; // X-RateLimit-Remaining-Requests
response.responseHeaders.rateLimitLimit; // X-RateLimit-Limit-Requests
response.responseHeaders.rateLimitReset; // X-RateLimit-Reset-Requests
response.responseHeaders.creditsBalanceMicrodollars; // X-Credits-Balance-Microdollars
response.responseHeaders.get("x-custom-header"); // any header by name
response.responseHeaders.getAll("x-multi-header"); // string[] for multi-value headers
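These header values can be turned into a quick quota summary. The sketch below assumes the three fields are exposed as numbers (or are missing); that typing is an assumption, and QuotaHeadersLike and summarizeQuota are hypothetical names. The only fixed fact used is that one dollar is 1,000,000 microdollars.

```typescript
// Structural stand-in for the fields read below; numeric typing is assumed.
interface QuotaHeadersLike {
  rateLimitRemaining?: number | null;
  rateLimitLimit?: number | null;
  creditsBalanceMicrodollars?: number | null;
}

interface QuotaSummary {
  remainingFraction: number | null; // share of the request budget still available
  creditsDollars: number | null; // credit balance converted from microdollars
}

function summarizeQuota(headers: QuotaHeadersLike): QuotaSummary {
  const { rateLimitRemaining, rateLimitLimit, creditsBalanceMicrodollars } = headers;
  const remainingFraction =
    typeof rateLimitRemaining === "number" &&
    typeof rateLimitLimit === "number" &&
    rateLimitLimit > 0
      ? rateLimitRemaining / rateLimitLimit
      : null;
  const creditsDollars =
    typeof creditsBalanceMicrodollars === "number"
      ? creditsBalanceMicrodollars / 1_000_000
      : null;
  return { remainingFraction, creditsDollars };
}
```

Missing headers map to null rather than 0 so a caller can tell "no data" apart from "exhausted".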
Constants
Runtime enum objects for routing configuration:
import { Optimize, Mode, DataPolicy } from "@auriko/sdk";
// Optimize strategy
Optimize.COST // "cost"
Optimize.COST_FOCUS // "cost-focus"
Optimize.TTFT // "ttft"
Optimize.TTFT_FOCUS // "ttft-focus"
Optimize.TPS // "tps"
Optimize.TPS_FOCUS // "tps-focus"
Optimize.BALANCED // "balanced"
// Routing mode
Mode.POOL // "pool"
Mode.FALLBACK // "fallback"
// Data policy
DataPolicy.NONE // "none"
DataPolicy.NO_TRAINING // "no_training"
DataPolicy.ZDR // "zdr"
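A "runtime enum object" in TypeScript is commonly a plain object declared with as const, from which a string-literal union type can be derived. The sketch below shows that general pattern using the Optimize values listed above. Whether the SDK defines its constants this way is an assumption; OptimizeSketch, OptimizeValue, and isOptimizeValue are hypothetical names.

```typescript
// Local copy of the documented values, for illustration only.
const OptimizeSketch = {
  COST: "cost",
  COST_FOCUS: "cost-focus",
  TTFT: "ttft",
  TTFT_FOCUS: "ttft-focus",
  TPS: "tps",
  TPS_FOCUS: "tps-focus",
  BALANCED: "balanced",
} as const;

// Union of the literal values: "cost" | "cost-focus" | ... | "balanced".
type OptimizeValue = (typeof OptimizeSketch)[keyof typeof OptimizeSketch];

// Runtime guard, useful when the strategy comes from config or user input.
function isOptimizeValue(value: string): value is OptimizeValue {
  return (Object.values(OptimizeSketch) as string[]).includes(value);
}
```

Because the object exists at runtime, the same definition serves both type checking and validation of untrusted strings.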
Types
Wire-format types use snake_case field names matching the API payloads; SDK-added members such as responseHeaders use camelCase.
Client & Stream
import { Client, Stream, ResponseStream } from "@auriko/sdk";
Chat Response Types
import type {
  ChatCompletion, ChatCompletionChunk, Choice, ChoiceMessage,
  ReasoningBlock, StreamChoice, Delta, ToolCall, ToolCallFunction,
  ToolCallDelta, ToolCallDeltaFunction,
} from "@auriko/sdk";
Response Types
Common types for the Responses API. The SDK exports every event type in the ResponseStreamEvent union; import individual event types (e.g., ResponseTextDeltaEvent, ResponseFunctionCallArgumentsDoneEvent) as needed.
import type {
  ResponseObject, ResponseObjectBase, ResponseUsage, ResponseError,
  ResponseOutputItem, ResponseMessageOutputItem,
  ResponseFunctionCallOutputItem, ResponseReasoningOutputItem,
  ResponseOutputContentPart, ResponseReasoningSummary,
  ResponseStreamEvent, ResponseCreateParams,
  ResponseInputItemParam, ResponseToolParam, ResponseReasoningParam,
} from "@auriko/sdk";
Common Types
import type { Usage, PromptTokensDetails, CompletionTokensDetails, ApiKeyIdentity } from "@auriko/sdk";
Routing Types
import type { GatewayOptions, RoutingOptions, RoutingMetadata, CostInfo, StructuredWarning, StructuredWarningType } from "@auriko/sdk";
Extensions
import type { Extensions } from "@auriko/sdk";
| Field | Type | Description |
|---|---|---|
| anthropic | Record<string, unknown> | Anthropic-specific parameters |
| openai | Record<string, unknown> | OpenAI-specific parameters |
| google | Record<string, unknown> | Google-specific parameters |
| deepseek | Record<string, unknown> | DeepSeek-specific parameters |
| [key] | Record<string, unknown> | Arbitrary provider passthrough |
Model Discovery Types
import type { DirectoryResponse, ModelsListResponse, ProviderList } from "@auriko/sdk";
Request Types
import type { ChatCompletionCreateParams, ClientOptions } from "@auriko/sdk";
Runtime Constants
import { Optimize, Mode, DataPolicy, ResponseHeaders } from "@auriko/sdk";
Error Classes
import {
  AurikoAPIError, APIConnectionError, APIStatusError,
  AuthenticationError, BadRequestError, ConflictError,
  InternalServerError, NotFoundError, PermissionDeniedError,
  RateLimitError, mapErrorFromCode,
} from "@auriko/sdk";