Vision - Auriko

Auriko supports vision through the OpenAI-compatible image_url content part. Pass an image URL or a base64 data URL in the content array of a user message.

Prerequisites

An Auriko API key
Python 3.10+ with the OpenAI SDK (pip install openai) or the auriko SDK (pip install auriko), or Node.js 18+ with the OpenAI SDK (npm install openai) or @auriko/sdk (npm install @auriko/sdk)
A vision-capable model (e.g., gpt-4o, claude-sonnet-4-6, gemini-flash-latest)

Analyze images from URLs

Pass an image URL as a content part in the user message:

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["AURIKO_API_KEY"],
    base_url="https://api.auriko.ai/v1",
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What is in this image?"},
            {"type": "image_url", "image_url": {
                "url": "https://upload.wikimedia.org/wikipedia/commons/4/47/PNG_transparency_demonstration_1.png"
            }},
        ],
    }],
    max_tokens=300,
)

print(response.choices[0].message.content)

Analyze base64-encoded images

For local files or private images, encode the bytes as a data URL:

import base64
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["AURIKO_API_KEY"],
    base_url="https://api.auriko.ai/v1",
)

with open("chart.png", "rb") as f:
    b64 = base64.b64encode(f.read()).decode("utf-8")

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Summarize the trend in this chart."},
            {"type": "image_url", "image_url": {"url": f"data:image/png;base64,{b64}"}},
        ],
    }],
    max_tokens=500,
)

print(response.choices[0].message.content)
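
The example above hard-codes image/png in the data URL. When files come in mixed formats, the MIME type can be inferred from the file extension instead. The following is a minimal sketch using only the Python standard library; to_data_url is a hypothetical helper name, not part of the OpenAI or Auriko SDKs, and it assumes the file extension matches the actual image format.

```python
import base64
import mimetypes
from pathlib import Path


def to_data_url(path: str) -> str:
    """Encode a local image file as a base64 data URL (hypothetical helper)."""
    # Infer the MIME type from the file extension, e.g. ".png" -> "image/png".
    mime, _ = mimetypes.guess_type(path)
    if mime is None or not mime.startswith("image/"):
        raise ValueError(f"Cannot infer an image MIME type for {path!r}")
    # Read the raw bytes and base64-encode them as ASCII text.
    encoded = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    return f"data:{mime};base64,{encoded}"
```

The returned string can be passed as the url value of an image_url content part, in place of the f-string used above.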

Send multiple images

Send several images in a single request by adding multiple image_url content parts:

messages=[{
    "role": "user",
    "content": [
        {"type": "text", "text": "Compare these two images."},
        {"type": "image_url", "image_url": {"url": "https://upload.wikimedia.org/wikipedia/commons/4/47/PNG_transparency_demonstration_1.png"}},
        {"type": "image_url", "image_url": {"url": "https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg"}},
    ],
}]
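
When the number of images varies at runtime, the content array can be assembled programmatically rather than written out by hand. This is a small sketch of one way to do that; build_vision_message is a hypothetical helper name, and the URLs in the usage note are placeholders.

```python
def build_vision_message(prompt: str, image_urls: list[str]) -> dict:
    """Build one user message with a text part followed by one image_url part per URL."""
    parts = [{"type": "text", "text": prompt}]
    # Each URL (HTTPS or base64 data URL) becomes its own image_url content part.
    parts += [{"type": "image_url", "image_url": {"url": url}} for url in image_urls]
    return {"role": "user", "content": parts}
```

For example, messages=[build_vision_message("Compare these two images.", [url_a, url_b])] produces the same structure as the literal shown above.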

Control image resolution

You can set detail on the image_url content part to control how much resolution the model uses:

{
    "type": "image_url",
    "image_url": {
        "url": "https://upload.wikimedia.org/wikipedia/commons/4/47/PNG_transparency_demonstration_1.png",
        "detail": "low",
    },
}

Value	Behavior
`auto`	The model decides based on image size (default)
`low`	Fixed low-resolution processing, fewer tokens
`high`	High-resolution processing, more tokens for fine detail

Use low for cost-sensitive workloads where fine detail isn’t needed. Use high when the model needs to read small text or distinguish fine visual features.
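
Because detail accepts only the three values in the table, a typo such as "hi" is easy to make and would only surface once the request is sent. A sketch of a small wrapper that validates the value up front follows; image_part and VALID_DETAIL are hypothetical names introduced for illustration.

```python
# The three values listed in the table above.
VALID_DETAIL = {"auto", "low", "high"}


def image_part(url: str, detail: str = "auto") -> dict:
    """Build an image_url content part, rejecting unknown detail values early."""
    if detail not in VALID_DETAIL:
        raise ValueError(f"detail must be one of {sorted(VALID_DETAIL)}, got {detail!r}")
    return {"type": "image_url", "image_url": {"url": url, "detail": detail}}
```

A call such as image_part(url, detail="low") returns the same dictionary as the literal shown above.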

Response shape

Vision responses use the standard ChatCompletionResponse shape. The model’s analysis appears in choices[0].message.content as text.

Errors

Situation	HTTP	SDK error
Image too large for the model’s context window	`400`	`BadRequestError`
Model doesn’t support vision	`400`	`BadRequestError`

Some models accept image URLs directly. Others require Auriko to process the image first, which adds these constraints:

Situation	HTTP	SDK error
Image URL unreachable	`400`	`BadRequestError`
Total image data exceeds 30 MB	`400`	`BadRequestError`
More than 1,500 images in one request	`400`	`BadRequestError`
Image URL isn’t HTTPS	`400`	`BadRequestError`
Unsupported image format	`400`	`BadRequestError`

URL resolution behavior varies by model. For consistent results across models, use base64-encoded images.
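
Some of the constraints in the second table can be checked on the client before a request is sent. The sketch below covers the image count, the HTTPS requirement, and the total size of inline base64 data. It rests on several assumptions: preflight is a hypothetical helper, 30 MB is taken to mean 30 × 1024 × 1024 bytes of decoded image data, and remote images are not downloaded, so their size is not counted. The server's checks remain authoritative.

```python
from urllib.parse import urlparse

MAX_IMAGES = 1500                   # limit from the table above
MAX_TOTAL_BYTES = 30 * 1024 * 1024  # assumption: "30 MB" means 30 MiB of decoded data


def preflight(image_urls: list[str]) -> list[str]:
    """Return a list of likely problems; an empty list means none were found."""
    problems = []
    if len(image_urls) > MAX_IMAGES:
        problems.append(f"too many images: {len(image_urls)} > {MAX_IMAGES}")
    inline_bytes = 0
    for url in image_urls:
        if url.startswith("data:"):
            # Base64 encodes 3 bytes as 4 characters, so this is an upper-bound estimate.
            payload = url.split(",", 1)[-1]
            inline_bytes += len(payload) * 3 // 4
        elif urlparse(url).scheme != "https":
            problems.append(f"not HTTPS: {url}")
    if inline_bytes > MAX_TOTAL_BYTES:
        problems.append(f"inline image data exceeds limit: about {inline_bytes} bytes")
    return problems
```

Calling preflight on the list of URLs before building the message lets a caller fix obvious problems locally instead of handling a 400 response.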

Check Supported parameters for the accepted content part types and see Error codes for the full error taxonomy.

Related

Streaming: stream vision responses chunk-by-chunk
Tool calling: combine vision with function calling
Structured output: extract structured data from images
Image generation: generate images with Gemini models
