Use Auriko as your LLM provider in LlamaIndex.

Installation

pip install "auriko[llamaindex]"

Use SDK adapter

Use the AurikoLlamaIndexLLM adapter:
from auriko.frameworks.llamaindex import AurikoLlamaIndexLLM

llm = AurikoLlamaIndexLLM(model="gpt-5.4")

AurikoLlamaIndexLLM extends LlamaIndex’s OpenAI LLM class with routing injection, per-call routing overrides, and Auriko error mapping.

from auriko.frameworks.llamaindex import AurikoLlamaIndexLLM
from llama_index.core.llms import ChatMessage

llm = AurikoLlamaIndexLLM(model="gpt-5.4")

# Simple chat
response = llm.chat([ChatMessage(role="user", content="What is 2+2?")])
print(response.message.content)

# Streaming
for chunk in llm.stream_chat([ChatMessage(role="user", content="Count to 5")]):
    print(chunk.delta, end="", flush=True)

Configure options

Parameter   Type                    Default                       Description
model       str                     (required, via parent)        Model ID
api_key     str | None              AURIKO_API_KEY env var        API key
routing     RoutingOptions | None   None                          Default routing configuration
api_base    str                     "https://api.auriko.ai/v1"    API base URL
**kwargs                                                          Passed through to LlamaIndex’s OpenAI (e.g., temperature, max_tokens)
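The api_key default in the table can be mirrored in plain Python. A minimal sketch, assuming only the documented fallback to the AURIKO_API_KEY environment variable (the helper name is hypothetical, not part of the SDK):

```python
import os

def resolve_api_key(explicit=None):
    # Mirror the documented default: an explicit api_key wins,
    # otherwise fall back to the AURIKO_API_KEY environment variable.
    key = explicit or os.environ.get("AURIKO_API_KEY")
    if not key:
        raise ValueError("Pass api_key or set AURIKO_API_KEY")
    return key
```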

Configure routing

Instance-level routing applies to all requests:
from auriko.frameworks.llamaindex import AurikoLlamaIndexLLM
from auriko.route_types import RoutingOptions

llm = AurikoLlamaIndexLLM(
    model="gpt-5.4",
    routing=RoutingOptions(optimize="cost"),
)
Per-call routing overrides the instance default:
from llama_index.core.llms import ChatMessage
from auriko.route_types import RoutingOptions

# Use speed optimization for this call only
response = llm.chat(
    [ChatMessage(role="user", content="Hello!")],
    routing=RoutingOptions(optimize="speed"),
)
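The precedence rule can be sketched in plain Python. This helper is hypothetical; the adapter resolves the same logic internally:

```python
def effective_routing(instance_routing, call_routing):
    # Per-call routing, when provided, takes precedence over the
    # instance-level default set in the constructor.
    return call_routing if call_routing is not None else instance_routing
```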
Access routing metadata from the response:
response = llm.chat([ChatMessage(role="user", content="Hello!")])
metadata = response.additional_kwargs.get("routing_metadata")
if metadata:
    print(f"Provider: {metadata['provider']}")
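Since routing metadata can be absent, a small guard helper keeps call sites tidy. A sketch assuming only the "provider" key shown above (the helper name is hypothetical):

```python
def describe_routing(additional_kwargs):
    # routing_metadata may be missing (e.g., when no routing was applied),
    # so treat it as optional rather than indexing into it directly.
    metadata = additional_kwargs.get("routing_metadata")
    if not metadata:
        return "no routing metadata"
    return f"Provider: {metadata['provider']}"
```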

Configure manually

If you prefer to use LlamaIndex’s OpenAI class directly:
import os
from llama_index.llms.openai import OpenAI

llm = OpenAI(
    model="gpt-5.4",
    api_key=os.environ["AURIKO_API_KEY"],
    api_base="https://api.auriko.ai/v1",
)
Note: routing options, per-call overrides, and Auriko error mapping aren’t available with manual configuration.

Notes

  • AurikoLlamaIndexLLM inherits all LlamaIndex OpenAI capabilities: chat, completion, streaming, async.
  • OpenAI API errors are automatically mapped to typed Auriko error classes (RateLimitError, BudgetExceededError, etc.).
  • Per-call routing overrides are unique to this adapter — pass routing=RoutingOptions(...) to any chat/complete call.
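Because errors arrive as typed classes, retry logic can key on them directly. A minimal backoff sketch; the retryable exception class is passed in as a parameter because these docs do not show the Auriko error import path:

```python
import time

def chat_with_retry(call, retryable_exc, max_attempts=3, base_delay=1.0):
    # Retry call() when it raises the given retryable exception class
    # (e.g., Auriko's RateLimitError), with exponential backoff.
    for attempt in range(max_attempts):
        try:
            return call()
        except retryable_exc:
            if attempt == max_attempts - 1:
                raise
            time.sleep(base_delay * 2 ** attempt)
```

Usage would wrap the adapter call, e.g. `chat_with_retry(lambda: llm.chat(messages), RateLimitError)`.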