Use Auriko as your LLM provider in LlamaIndex.

Installation

pip install "auriko[llamaindex]"

Use SDK adapter

Use the AurikoLlamaIndexLLM adapter:
from auriko.frameworks.llamaindex import AurikoLlamaIndexLLM

llm = AurikoLlamaIndexLLM(model="gpt-5.4")

AurikoLlamaIndexLLM extends LlamaIndex’s OpenAI LLM class with routing injection, per-call routing overrides, and Auriko error mapping.

from auriko.frameworks.llamaindex import AurikoLlamaIndexLLM
from llama_index.core.llms import ChatMessage

llm = AurikoLlamaIndexLLM(model="gpt-5.4")

# Simple chat
response = llm.chat([ChatMessage(role="user", content="What is 2+2?")])
print(response.message.content)

# Streaming
for chunk in llm.stream_chat([ChatMessage(role="user", content="Count to 5")]):
    print(chunk.delta, end="", flush=True)

Configure options

Parameter   Type                    Default                       Description
model       str                     (required, via parent)        Model ID
api_key     str | None              AURIKO_API_KEY env var        API key
routing     RoutingOptions | None   None                          Default routing configuration
api_base    str                     "https://api.auriko.ai/v1"    API base URL
**kwargs                                                          Passed through to LlamaIndex’s OpenAI (e.g., temperature, max_tokens)
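The api_key default in the table can be mirrored in plain Python. A minimal sketch, assuming only the documented fallback to the AURIKO_API_KEY environment variable (the helper name is hypothetical, not part of the SDK):

```python
import os

def resolve_api_key(explicit=None):
    # Mirror the documented default: an explicit api_key wins,
    # otherwise fall back to the AURIKO_API_KEY environment variable.
    key = explicit or os.environ.get("AURIKO_API_KEY")
    if not key:
        raise ValueError("Pass api_key or set AURIKO_API_KEY")
    return key
```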

Configure routing

Instance-level routing applies to all requests:
from auriko.frameworks.llamaindex import AurikoLlamaIndexLLM
from auriko.route_types import RoutingOptions

llm = AurikoLlamaIndexLLM(
    model="gpt-5.4",
    routing=RoutingOptions(optimize="cost"),
)
Per-call routing overrides the instance default:
from llama_index.core.llms import ChatMessage
from auriko.route_types import RoutingOptions

# Use speed optimization for this call only
response = llm.chat(
    [ChatMessage(role="user", content="Hello!")],
    routing=RoutingOptions(optimize="speed"),
)
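The precedence rule can be sketched in plain Python. This helper is hypothetical; the adapter resolves the same logic internally:

```python
def effective_routing(instance_routing, call_routing):
    # Per-call routing, when provided, takes precedence over the
    # instance-level default set in the constructor.
    return call_routing if call_routing is not None else instance_routing
```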
Access routing metadata from the response:
response = llm.chat([ChatMessage(role="user", content="Hello!")])
metadata = response.additional_kwargs.get("routing_metadata")
if metadata:
    print(f"Provider: {metadata['provider']}")
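Since routing metadata can be absent, a small guard helper keeps call sites tidy. A sketch assuming only the "provider" key shown above (the helper name is hypothetical):

```python
def describe_routing(additional_kwargs):
    # routing_metadata may be missing (e.g., when no routing was applied),
    # so treat it as optional rather than indexing into it directly.
    metadata = additional_kwargs.get("routing_metadata")
    if not metadata:
        return "no routing metadata"
    return f"Provider: {metadata['provider']}"
```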

Configure manually

If you prefer to use LlamaIndex’s OpenAI class directly:
import os
from llama_index.llms.openai import OpenAI

llm = OpenAI(
    model="gpt-5.4",
    api_key=os.environ["AURIKO_API_KEY"],
    api_base="https://api.auriko.ai/v1",
)
Note: routing options, per-call overrides, and Auriko error mapping aren’t available with manual configuration.

Notes

  • AurikoLlamaIndexLLM inherits all LlamaIndex OpenAI capabilities: chat, completion, streaming, async.
  • OpenAI API errors are automatically mapped to typed Auriko error classes (RateLimitError, BudgetExceededError, etc.).
  • Per-call routing overrides are unique to this adapter — pass routing=RoutingOptions(...) to any chat/complete call.
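Because errors arrive as typed classes, retry logic can key on them directly. A minimal backoff sketch; the retryable exception class is passed in as a parameter because these docs do not show the Auriko error import path:

```python
import time

def chat_with_retry(call, retryable_exc, max_attempts=3, base_delay=1.0):
    # Retry call() when it raises the given retryable exception class
    # (e.g., Auriko's RateLimitError), with exponential backoff.
    for attempt in range(max_attempts):
        try:
            return call()
        except retryable_exc:
            if attempt == max_attempts - 1:
                raise
            time.sleep(base_delay * 2 ** attempt)
```

Usage would wrap the adapter call, e.g. `chat_with_retry(lambda: llm.chat(messages), RateLimitError)`.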