The @auriko/sdk package provides a typed TypeScript client for the Auriko API.
Full SDK Reference: complete API reference with all types, parameters, and examples.
Installation
npm install @auriko/sdk
# or
yarn add @auriko/sdk
# or
pnpm add @auriko/sdk
Get started
import { Client } from "@auriko/sdk";
const client = new Client(); // reads AURIKO_API_KEY from environment
const response = await client.chat.completions.create({
  model: "gpt-5.4",
  messages: [{ role: "user", content: "Hello!" }],
});
console.log(response.choices[0].message.content);
API Key
// Option 1: Auto-detect from AURIKO_API_KEY env var (recommended)
const client = new Client();
// Option 2: Pass explicitly
const client = new Client({
  apiKey: process.env.AURIKO_API_KEY,
});
Base URL
// Default: https://api.auriko.ai/v1
// Override for self-hosted or proxy setups:
const client = new Client({
  baseUrl: "https://your-proxy.example.com/v1",
});
Timeout
const client = new Client({
  timeout: 60000, // milliseconds
});
Retries
const client = new Client({
  maxRetries: 3, // default is 2
});
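The options above are shown one at a time. The following is a minimal sketch of collecting them into a single options object from an environment-like map. It assumes the constructor accepts these options together, which the sections above do not state explicitly. The helper `buildClientOptions` and the variable names `AURIKO_BASE_URL` and `AURIKO_TIMEOUT_MS` are hypothetical and not part of the SDK.

```typescript
// Option names (apiKey, baseUrl, timeout, maxRetries) come from the sections above.
// Combining them in one object is an assumption of this sketch.
interface ClientOptions {
  apiKey?: string;
  baseUrl?: string;
  timeout?: number; // milliseconds
  maxRetries?: number;
}

// Hypothetical helper: builds client options from an environment-like map.
// AURIKO_BASE_URL and AURIKO_TIMEOUT_MS are made-up names for illustration.
function buildClientOptions(env: Record<string, string | undefined>): ClientOptions {
  const options: ClientOptions = { maxRetries: 3 };
  const apiKey = env["AURIKO_API_KEY"];
  const baseUrl = env["AURIKO_BASE_URL"];
  if (apiKey) options.apiKey = apiKey;
  if (baseUrl) options.baseUrl = baseUrl;
  // Only set a timeout when the value parses to a positive number.
  const timeout = Number(env["AURIKO_TIMEOUT_MS"]);
  if (Number.isFinite(timeout) && timeout > 0) options.timeout = timeout;
  return options;
}
```

In Node.js this could be used as `new Client(buildClientOptions(process.env))`, provided the assumption about combined options holds.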
Create chat completions
Basic request
Send a chat completion request:
const response = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "What is 2+2?" },
  ],
});
console.log(response.choices[0].message.content);
With routing options
const response = await client.chat.completions.create({
  model: "gpt-5.4",
  messages: [{ role: "user", content: "Hello!" }],
  gateway: {
    routing: {
      optimize: "cost",
      max_ttft_ms: 1000,
    },
  },
});
// Access routing metadata
console.log(`Provider: ${response.routing_metadata?.provider}`);
if (response.routing_metadata?.cost) {
  console.log(`Cost: $${response.routing_metadata.cost.usd}`);
}
You can also use the RoutingOptions type with enum constants for IDE autocomplete:
import { Optimize } from "@auriko/sdk";
import type { RoutingOptions } from "@auriko/sdk";
const routing: RoutingOptions = {
  optimize: Optimize.COST,
  max_ttft_ms: 1000,
};
All routing fields:
| Field | Type | Description |
| --- | --- | --- |
| optimize | Optimize | Strategy: "cost", "cost-focus", "ttft", "ttft-focus", "tps", "tps-focus", "balanced" |
| weights | RoutingWeights | Custom scoring weights: cost, ttft, throughput. Overrides preset. |
| ttft_percentile | MetricPercentile | TTFT scoring percentile: "p50" (default) or "p95" |
| throughput_percentile | MetricPercentile | Throughput scoring percentile: "p50" (default) or "p95" |
| max_cost_per_1m | number | Max $ per 1M tokens (average of input + output) |
| max_ttft_ms | number | Max TTFT in milliseconds |
| min_throughput_tps | number | Min throughput in tokens/sec |
| providers | string[] | Allowlist of providers |
| exclude_providers | string[] | Blocklist of providers |
| prefer | string | Preferred provider (soft preference) |
| mode | Mode | "pool" (default) or "fallback" |
| allow_fallbacks | boolean | Enable fallback on failure |
| max_fallback_attempts | number | Max fallback retries |
| data_policy | DataPolicy | "none", "no_training", "zdr" |
| only_byok | boolean | Only use BYOK providers |
| only_platform | boolean | Only use platform providers |
See Advanced Routing for detailed strategy guides.
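The max_cost_per_1m field is described as a cap on the average of input and output price. As a rough worked example, the sketch below computes that blended price as a simple mean and filters candidates against a cap. It assumes the router uses exactly this mean, which the table implies but does not spell out. The provider names and prices are made up.

```typescript
// Prices are in USD per 1M tokens. Providers and prices are illustrative only.
interface ProviderPrice {
  provider: string;
  inputPer1M: number;
  outputPer1M: number;
}

// Blended price as the simple mean of input and output price,
// following the "average of input + output" wording in the table.
function blendedCostPer1M(p: ProviderPrice): number {
  return (p.inputPer1M + p.outputPer1M) / 2;
}

// Returns the providers whose blended price is at or below the cap.
function withinCostCap(prices: ProviderPrice[], maxCostPer1M: number): string[] {
  return prices
    .filter((p) => blendedCostPer1M(p) <= maxCostPer1M)
    .map((p) => p.provider);
}

const candidates: ProviderPrice[] = [
  { provider: "provider-a", inputPer1M: 2, outputPer1M: 8 },  // blended: 5
  { provider: "provider-b", inputPer1M: 3, outputPer1M: 15 }, // blended: 9
];

// With a cap of 6, only provider-a stays eligible.
const eligible = withinCostCap(candidates, 6);
```

Under this reading, a request with `max_cost_per_1m: 6` would rule out the second candidate even though its input price alone is below the cap.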
Multi-model routing
Route a request across multiple models. The router picks the best option based on your routing strategy:
const response = await client.chat.completions.create({
  gateway: {
    models: ["gpt-4o", "claude-sonnet-4-20250514", "gemini-2.5-flash"],
    routing: { optimize: "cost" },
  },
  messages: [{ role: "user", content: "Explain quantum computing briefly." }],
});
console.log(`Model used: ${response.model}`);
console.log(`Provider: ${response.routing_metadata?.provider}`);
console.log(response.choices[0].message.content);
model and gateway.models are mutually exclusive: specify exactly one. Passing both throws a BadRequestError.
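The rule above can also be checked locally before sending a request. This is a minimal sketch, not an SDK feature: `checkModelTarget` and the `RequestTarget` shape are hypothetical and cover only the two fields involved.

```typescript
// Only the two fields involved in the rule; not the SDK's full request type.
interface RequestTarget {
  model?: string;
  gateway?: { models?: string[] };
}

// Returns an error message, or null when exactly one target is specified.
function checkModelTarget(req: RequestTarget): string | null {
  const hasModel = typeof req.model === "string" && req.model.length > 0;
  const models = req.gateway?.models ?? [];
  const hasModels = models.length > 0;
  if (hasModel && hasModels) {
    return "model and gateway.models are mutually exclusive";
  }
  if (!hasModel && !hasModels) {
    return "specify either model or gateway.models";
  }
  return null;
}
```

A pre-flight check like this surfaces the mistake without a network round trip. The server-side validation remains the source of truth.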
Reasoning effort
Enable extended reasoning for complex tasks using the reasoning_effort parameter:
const response = await client.chat.completions.create({
  model: "claude-sonnet-4-6",
  messages: [{ role: "user", content: "Solve step by step: what is 23! / 20!?" }],
  reasoning_effort: "high",
});
// Access the reasoning output (if the model returns it)
if (response.choices[0].message.reasoning_content) {
  console.log(`Reasoning: ${response.choices[0].message.reasoning_content}`);
}
console.log(`Answer: ${response.choices[0].message.content}`);
You can also pass provider-specific parameters through extensions:
const response = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Hello!" }],
  extensions: { openai: { logit_bias: { "1234": -100 } } },
});
See Extensions and Thinking for provider details and streaming thinking output.
Attach request metadata
Attach metadata to requests for tracking and analytics:
const response = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Hello!" }],
  gateway: { metadata: { user_id: "user-123", tags: ["premium"] } },
});
Valid metadata fields: user_id, tags (list), trace_id, and custom_fields (object for arbitrary key-value pairs). See the TypeScript SDK Reference for field constraints.
Stream responses
const stream = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Count to 10" }],
  stream: true,
});
for await (const chunk of stream) {
  if (chunk.choices[0]?.delta?.content) {
    process.stdout.write(chunk.choices[0].delta.content);
  }
}
After consuming all chunks, access stream-level metadata:
console.log(`\nProvider: ${stream.routing_metadata?.provider}`);
console.log(`Tokens: ${stream.usage?.total_tokens}`);
console.log(`Request ID: ${stream.responseHeaders.requestId}`);
console.log(`Closed: ${stream.isClosed}`);
Close a stream manually with stream.close().
Routing metadata, usage, and response headers are available only after consuming all chunks.
See Streaming Guide for full patterns including tool call streaming.
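When the full text is needed rather than incremental output, the deltas can be accumulated. The sketch below assumes only the chunk fields read in the loop above (`choices[0].delta.content`). `ChunkLike`, `appendDelta`, and `collectText` are hypothetical names, not SDK exports.

```typescript
// Minimal chunk shape, limited to the fields the streaming loop reads.
interface ChunkLike {
  choices: Array<{ delta?: { content?: string | null } }>;
}

// Appends one chunk's text delta to the accumulated text.
// Chunks without choices or without content contribute nothing.
function appendDelta(text: string, chunk: ChunkLike): string {
  return text + (chunk.choices[0]?.delta?.content ?? "");
}

// Consumes any async iterable of chunks and resolves to the full text.
async function collectText(stream: AsyncIterable<ChunkLike>): Promise<string> {
  let text = "";
  for await (const chunk of stream) {
    text = appendDelta(text, chunk);
  }
  return text;
}
```

Because `collectText` consumes every chunk, stream-level metadata such as `stream.usage` should be readable once it resolves, in line with the note above.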
Call tools
Define tools and pass them with the request:
const tools = [
  {
    type: "function" as const,
    function: {
      name: "get_weather",
      description: "Get weather for a city",
      parameters: {
        type: "object",
        properties: {
          city: { type: "string" },
        },
        required: ["city"],
      },
    },
  },
];
const response = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "What's the weather in Paris?" }],
  tools,
});
if (response.choices[0].message.tool_calls) {
  const toolCall = response.choices[0].message.tool_calls[0];
  console.log(`Function: ${toolCall.function.name}`);
  console.log(`Arguments: ${toolCall.function.arguments}`);
}
See Tool Calling Guide for multi-turn tool conversations.
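After reading a tool call, the application usually has to run the matching function itself. The sketch below dispatches a tool call to a local handler. It assumes `function.arguments` is a JSON-encoded object string, as in the OpenAI-compatible format; the handler registry and `runToolCall` are hypothetical and not part of the SDK.

```typescript
// Shape of a tool call, limited to the fields read in the example above.
interface ToolCallLike {
  function: { name: string; arguments: string };
}

type ToolHandler = (args: Record<string, unknown>) => string;

// Hypothetical local implementation of the get_weather tool.
const handlers: Record<string, ToolHandler> = {
  get_weather: (args) => `Sunny in ${String(args["city"])}`,
};

// Looks up the handler by name, parses the arguments, and runs the handler.
function runToolCall(call: ToolCallLike, registry: Record<string, ToolHandler>): string {
  const handler = registry[call.function.name];
  if (!handler) {
    return `Unknown tool: ${call.function.name}`;
  }
  let parsed: unknown;
  try {
    // Assumption: arguments is a JSON-encoded object string.
    parsed = JSON.parse(call.function.arguments);
  } catch {
    return `Invalid arguments for ${call.function.name}`;
  }
  if (typeof parsed !== "object" || parsed === null) {
    return `Invalid arguments for ${call.function.name}`;
  }
  return handler(parsed as Record<string, unknown>);
}
```

The returned string would typically be sent back to the model in a follow-up message, which the Tool Calling Guide covers.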
Create responses
Send a request using the OpenAI Response API format:
const response = await client.responses.create({
  model: "gpt-4o",
  input: "What is the capital of France?",
});
console.log(response.output_text);
Stream Response API events
const stream = await client.responses.create({
  model: "gpt-4o",
  input: "Count to 10",
  stream: true,
});
for await (const event of stream) {
  if (event.type === "response.output_text.delta") {
    process.stdout.write(event.delta);
  }
}
console.log(`\nTokens: ${stream.completedResponse?.usage?.total_tokens}`);
See the TypeScript SDK Reference for all parameters and event types.
Read response headers
Every response and error includes a responseHeaders object with typed accessors:
const response = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Hello!" }],
});
response.responseHeaders.requestId; // string | undefined
response.responseHeaders.rateLimitRemaining; // number | undefined
response.responseHeaders.rateLimitLimit; // number | undefined
response.responseHeaders.rateLimitReset; // string | undefined
response.responseHeaders.creditsBalanceMicrodollars; // number | undefined
response.responseHeaders.get("x-custom-header"); // generic lookup
response.responseHeaders.getAll("x-multi-header"); // string[] for multi-value headers
| Property | Header | Type |
| --- | --- | --- |
| requestId | x-request-id | string \| undefined |
| rateLimitRemaining | x-ratelimit-remaining-requests | number \| undefined |
| rateLimitLimit | x-ratelimit-limit-requests | number \| undefined |
| rateLimitReset | x-ratelimit-reset-requests | string \| undefined |
| creditsBalanceMicrodollars | x-credits-balance-microdollars | number \| undefined |
Error objects also carry responseHeaders. Use e.responseHeaders.requestId when filing support tickets to correlate with server logs.
See the TypeScript SDK Reference for the complete ResponseHeaders API.
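As an illustration of working with these accessors, the sketch below converts the credit balance to dollars and flags a nearly exhausted request quota. `HeadersLike`, `creditsInUsd`, and `isNearRateLimit` are hypothetical helpers, and the 10% threshold is an arbitrary choice for the example.

```typescript
// Subset of the typed accessors listed above; every value may be undefined.
interface HeadersLike {
  rateLimitRemaining?: number;
  rateLimitLimit?: number;
  creditsBalanceMicrodollars?: number;
}

// Converts the balance to USD. One microdollar is one millionth of a dollar.
function creditsInUsd(h: HeadersLike): number | undefined {
  if (h.creditsBalanceMicrodollars === undefined) return undefined;
  return h.creditsBalanceMicrodollars / 1_000_000;
}

// True when less than `threshold` (a fraction) of the request quota remains.
// Returns false when either header is missing, since nothing can be concluded.
function isNearRateLimit(h: HeadersLike, threshold = 0.1): boolean {
  if (h.rateLimitRemaining === undefined || h.rateLimitLimit === undefined) return false;
  if (h.rateLimitLimit <= 0) return false;
  return h.rateLimitRemaining / h.rateLimitLimit < threshold;
}
```

For example, a balance of 2,500,000 microdollars corresponds to $2.50, and 5 remaining requests out of 100 falls under the 10% threshold.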
Read token usage
The Usage object on every response carries optional detail breakdowns:
const response = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Hello!" }],
});
const usage = response.usage;
// Prompt token breakdown
if (usage?.prompt_tokens_details) {
  console.log(`Cached: ${usage.prompt_tokens_details.cached_tokens}`);
}
// Completion token breakdown
if (usage?.completion_tokens_details) {
  console.log(`Reasoning: ${usage.completion_tokens_details.reasoning_tokens}`);
}
| Field | Sub-fields | Type |
| --- | --- | --- |
| prompt_tokens_details | cached_tokens | number \| undefined |
| completion_tokens_details | reasoning_tokens | number \| undefined |
Availability depends on the provider. completion_tokens_details.reasoning_tokens is present for OpenAI o-series, DeepSeek, xAI, and Google Gemini. It’s undefined for providers that don’t report reasoning token counts (Anthropic, Moonshot, Fireworks).
See Check reasoning token availability for the full breakdown.
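Because the detail fields are optional, code that derives numbers from them needs defaults. The sketch below is one way to do that. It assumes the usage object also carries `prompt_tokens` and `completion_tokens`, and that cached and reasoning tokens are counted inside those totals, as in the OpenAI-compatible format; neither point is confirmed on this page. `UsageLike` and `summarizeUsage` are hypothetical names.

```typescript
// Assumed usage shape. prompt_tokens and completion_tokens are assumptions
// based on the OpenAI-compatible format; the detail fields come from the table above.
interface UsageLike {
  prompt_tokens: number;
  completion_tokens: number;
  total_tokens: number;
  prompt_tokens_details?: { cached_tokens?: number };
  completion_tokens_details?: { reasoning_tokens?: number };
}

interface UsageSummary {
  uncachedPromptTokens: number;
  visibleCompletionTokens: number;
  reasoningReported: boolean;
}

// Treats missing detail fields as zero and records whether the provider
// reported a reasoning token count at all.
function summarizeUsage(u: UsageLike): UsageSummary {
  const cached = u.prompt_tokens_details?.cached_tokens ?? 0;
  const reasoning = u.completion_tokens_details?.reasoning_tokens;
  return {
    uncachedPromptTokens: u.prompt_tokens - cached,
    visibleCompletionTokens: u.completion_tokens - (reasoning ?? 0),
    reasoningReported: reasoning !== undefined,
  };
}
```

Keeping `reasoningReported` separate avoids confusing "zero reasoning tokens" with "provider does not report them".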
Handle errors
Catch typed exceptions:
import {
  Client,
  AurikoAPIError,
  APIConnectionError,
  APIStatusError,
  AuthenticationError,
  BadRequestError,
  ConflictError,
  InternalServerError,
  NotFoundError,
  PermissionDeniedError,
  RateLimitError,
} from "@auriko/sdk";
const client = new Client();
try {
  const response = await client.chat.completions.create({
    model: "gpt-4o",
    messages: [{ role: "user", content: "Hello!" }],
  });
} catch (e) {
  if (e instanceof AuthenticationError) {
    console.log(`Check your API key: ${e.message} (requestId=${e.requestId})`);
  } else if (e instanceof RateLimitError) {
    console.log(`Rate limited: retry after ${e.retryAfterSeconds}s`);
  } else if (e instanceof PermissionDeniedError) {
    console.log(`Permission denied (code=${e.code}): ${e.message}`);
  } else if (e instanceof NotFoundError) {
    console.log(`Not found: ${e.message}`);
  } else if (e instanceof BadRequestError) {
    console.log(`Invalid request (param=${e.param}): ${e.message}`);
  } else if (e instanceof APIStatusError) {
    console.log(`Upstream/api error (${e.statusCode}, code=${e.code}): ${e.message}`);
  } else if (e instanceof APIConnectionError) {
    console.log(`Network failure before response: ${e.message}`);
  } else if (e instanceof AurikoAPIError) {
    console.log(`API error (${e.statusCode}): ${e.message}`);
  }
}
See Error Handling Guide for retry patterns.
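For application-level retries on top of the client's built-in `maxRetries`, a delay has to be chosen between attempts. The sketch below prefers the server hint (`retryAfterSeconds` from RateLimitError) and otherwise falls back to capped exponential backoff. `retryDelayMs` and its default values are hypothetical, and whether the SDK's own retries already honor the hint is not stated here.

```typescript
// Delay in milliseconds before retry number `attempt` (0-based).
// Uses the server-provided hint when present; otherwise doubles the base
// delay on every attempt, up to a cap.
function retryDelayMs(
  attempt: number,
  retryAfterSeconds?: number,
  baseMs = 500,
  capMs = 30_000,
): number {
  if (retryAfterSeconds !== undefined && retryAfterSeconds >= 0) {
    return retryAfterSeconds * 1000;
  }
  return Math.min(baseMs * 2 ** attempt, capMs);
}
```

With the defaults, attempts 0 through 3 wait 500, 1000, 2000, and 4000 ms. A caller would typically sleep for this long inside the `RateLimitError` branch before repeating the request.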
Use identity and model discovery APIs
Query identity and model information:
// Identity (discover your workspace)
const identity = await client . me . get ();
// Models
const models = await client . models . list ();
const model = await client . models . retrieve ( "claude-sonnet-4-6" );
const registry = await client . models . listRegistry ();
const directory = await client . models . listDirectory ();
const providers = await client . models . listProviders ();
Model listing choices
| Method | Returns | Use when |
| --- | --- | --- |
| list() | All models with provider availability, pricing, data policy | You need the full model catalog |
| retrieve(modelId) | Single model: provider availability, pricing, data policy | You have a model ID and need its details |
| listRegistry() | Flat list: id, family, display_name | You need a quick model ID lookup |
| listDirectory() | Rich detail: provider entries, context windows, capabilities, pricing tiers | You need to compare providers or check capabilities |
| listProviders() | Provider catalog: display name, description, data policy | You need to see available providers |
See the TypeScript SDK Reference for the complete API.
SDK scope
The Auriko SDK covers: inference (chat completions and the Response API, both with routing), identity, and model discovery. For full platform operations, use the REST API directly. If you use the Vercel AI SDK, see @auriko/ai-sdk-provider instead.
Use TypeScript types
The SDK provides typed responses, errors, and routing configuration. Import types directly:
import type {
  ChatCompletion,
  ChatCompletionChunk,
  ChoiceMessage,
  Choice,
  Usage,
  RoutingMetadata,
  RoutingOptions,
  Extensions,
  ResponseObject,
  ResponseStreamEvent,
  ResponseCreateParams,
} from "@auriko/sdk";
Node.js, Deno, and Browser
The SDK works in multiple environments:
Node.js
import { Client } from "@auriko/sdk";
const client = new Client(); // reads AURIKO_API_KEY from env
Deno
import { Client } from "npm:@auriko/sdk";
const client = new Client({
  apiKey: Deno.env.get("AURIKO_API_KEY"),
});
Browser (with bundler)
import { Client } from "@auriko/sdk";
// Pass API key from your backend - never expose in client-side code!
const client = new Client({
  apiKey: apiKeyFromBackend,
});
Never expose your API key in client-side code. Use a backend proxy instead.