Documentation

Intro

AI Router gives you a single OpenAI-compatible API endpoint for multiple models hosted in Switzerland. Everything is included in one flat monthly subscription — no per-token pricing, no surprises. This page covers the essentials to get you up and running.

Getting Started

Set your base URL, authenticate with an API key, and start sending requests in minutes. No special SDK is needed: any OpenAI client works as a drop-in replacement.

Base URL

export AIROUTER_BASE_URL="https://api.airouter.ch/v1"

All requests go to this endpoint. The API follows the OpenAI API format for chat completions, embeddings, and streaming.

Authentication

Authenticate by including your API key in the Authorization header. Keys are generated per user and available in your dashboard after subscribing.

export AIROUTER_API_KEY="sk-your-key-here"
curl -H "Authorization: Bearer $AIROUTER_API_KEY" ...
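
The same header can be built from Python's standard library. This is a minimal sketch rather than an official client: the helper name `build_chat_request` is hypothetical, and the function only constructs the request without sending it.

```python
import json
import urllib.request


def build_chat_request(base_url, api_key, payload):
    """Build (but do not send) an authenticated POST to /chat/completions.

    Hypothetical helper for illustration; not part of any AI Router SDK.
    """
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # The API key travels as a Bearer token, as described above.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To send the request, read the key from `os.environ["AIROUTER_API_KEY"]`, pass the result to `urllib.request.urlopen`, and decode the JSON body of the response.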

Quick Start

Send your first chat completion with cURL:

curl "https://api.airouter.ch/v1/chat/completions" \
  -H "Authorization: Bearer $AIROUTER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Qwen3.6",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "What is the capital of Switzerland?"}
    ],
    "temperature": 0.7,
    "max_tokens": 4096
  }'

Python (OpenAI SDK)

Use the official OpenAI Python library — just change the base URL:

from openai import OpenAI

client = OpenAI(
    api_key="sk-your-key-here",
    base_url="https://api.airouter.ch/v1",
)

response = client.chat.completions.create(
    model="Qwen3.6",
    messages=[{"role": "user", "content": "Hello!"}],
)

print(response.choices[0].message.content)

OpenAI-compatible: Any SDK or tool that speaks the OpenAI API format works with AI Router. Just set base_url to https://api.airouter.ch/v1 and use your API key. Works with cURL, Python, Node.js, LangChain, OpenClaw, Claude Code, OpenCode, and more.
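
Streaming responses arrive as server-sent events (SSE). If you read the stream without an SDK, a parser along these lines may help. It is a sketch that assumes AI Router emits chunks in the usual OpenAI streaming shape (`data:` lines carrying `choices[].delta.content`, ended by `data: [DONE]`); the function name and the sample lines are illustrative only.

```python
import json


def iter_stream_text(lines):
    """Yield text fragments from OpenAI-style SSE chat-completion lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and SSE comments
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


# Hand-written sample lines standing in for a real response stream.
sample = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "Be"}}]}',
    'data: {"choices": [{"delta": {"content": "rn"}}]}',
    "data: [DONE]",
]
print("".join(iter_stream_text(sample)))  # prints: Bern
```

With the OpenAI Python SDK you would normally pass `stream=True` and iterate over the returned chunks instead of parsing lines yourself.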

Models

AI Router provides access to three models through a single API endpoint. All models are hosted in Switzerland, included in the flat monthly subscription, and accessible via the OpenAI-compatible format.

Model Comparison

Property	Qwen3.6	DeepSeek-V4-Flash	Qwen3-Embedding
Model ID	`Qwen3.6`	`DeepSeek-V4-Flash`	`Qwen3-Embedding`
Parameters	27B (dense)	284B (13B active, MoE)	4B (dense)
Context	262,144 tokens	262,144 tokens	32,768 tokens
Max output	65,536 tokens	65,536 tokens	—
Reasoning	✓	✓	—
Tool calling	✓	✓	—
Image input	✓	✓ augmented	—
Structured output	✓	✓	—
Streaming	✓ (SSE)	✓ (SSE)	—
Languages	119	100+	100+ (incl. code)
Embedding dims	—	—	2,560

Model aliases: You can use DeepSeek-V4-Flash or the shorter deepseek-v4. For Qwen3.6, use Qwen3.6. For embeddings, use Qwen3-Embedding.
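
For Qwen3-Embedding, a typical workflow is to embed several texts and compare the resulting vectors. The sketch below assumes the embeddings endpoint accepts the standard OpenAI body (`model` plus `input`); both helper names are hypothetical, and the similarity function works on any pair of equal-length vectors, such as the 2,560-dimensional ones listed in the table.

```python
import math


def embedding_payload(texts, model="Qwen3-Embedding"):
    """Request body for an OpenAI-style embeddings call (assumed format)."""
    return {"model": model, "input": list(texts)}


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)
```

Values close to 1 indicate vectors pointing in nearly the same direction, which is commonly read as semantic similarity.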

DeepSeek Vision Augmentation

DeepSeek-V4-Flash is a text-only model and does not natively support image inputs. AI Router bridges this gap by automatically routing image-containing requests through Qwen3.6, which natively supports vision. The augmentation is transparent — you send the same OpenAI-compatible request with image content and we handle the rest.

How It Works

Send your request — Include image content in the usual OpenAI format (content: [{ "type": "image_url", ... }]) using DeepSeek-V4-Flash as the model.
Augmentation detects image content — The middleware detects image data in your request and routes it to Qwen3.6 for visual processing.
Seamless response — You get a response as if DeepSeek handled the image itself. No special endpoint, no extra configuration.
Follow-up turns use DeepSeek — Subsequent messages in the same conversation are routed back to DeepSeek-V4-Flash, including the image analysis context from Qwen3.6. DeepSeek reasons over the visual information and continues the conversation normally.
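
The first step above can be sketched as follows. The helper name `image_message` is hypothetical, and sending the image as a base64 data URL is an assumption based on the OpenAI format; a plain HTTPS image URL in the same `url` field is the other common option.

```python
import base64


def image_message(prompt, image_bytes, mime_type="image/png"):
    """Build a user message with one text part and one image part."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {
                "type": "image_url",
                "image_url": {"url": f"data:{mime_type};base64,{encoded}"},
            },
        ],
    }


# Placeholder bytes; in practice read them from an image file.
payload = {
    "model": "DeepSeek-V4-Flash",
    "messages": [image_message("Describe this image.", b"<raw image bytes>")],
}
```

The request body is the same one you would send to a vision-capable model; only the `model` field names DeepSeek-V4-Flash.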

Transparent: Use the same SDK, same endpoint, same API key. Vision augmentation is applied automatically when needed.

Configuration

Both chat models (Qwen3.6 and DeepSeek-V4-Flash) support standard OpenAI-compatible parameters. API calls always use the model's built-in defaults unless you override them. Below are the parameters you're most likely to need.

reasoning_effort

Controls how much reasoning computation the model performs before responding. Higher effort → more thorough reasoning, higher latency, better results on complex tasks. Default is "high" for both models.

Value	Qwen3.6	DeepSeek	Actual behavior
`"none"`	✓	—	Reasoning off. Direct answer, no chain-of-thought. Qwen3.6 only.
`"high"`	✓	✓	Default. Full reasoning, good for most tasks. On DeepSeek, `"low"` and `"medium"` are mapped to `"high"`. On Qwen3.6, reasoning is either on or off, so every value other than `"none"` behaves identically.
`"max"`	✓	✓	Maximum reasoning. Higher latency, best for hard problems. On DeepSeek, `"xhigh"` is mapped to `"max"`.

DeepSeek mapping reference: "low" / "medium" → "high" · "xhigh" → "max". Non-thinking mode cannot be selected through reasoning_effort, and AI Router does not currently support it for DeepSeek.

Qwen3.6: Reasoning is on by default. There are no internal reasoning levels — it's simply on or off. "none" disables reasoning; "low", "medium", "high", and "max" all enable it identically.
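
To make the mapping concrete, here is a small lookup that mirrors the rules documented in this section. It is an illustration of the documented behavior, not AI Router's server-side code; the function name `effective_reasoning` and the choice to raise an error for undocumented combinations are assumptions.

```python
def effective_reasoning(model, effort="high"):
    """Return the behavior this section documents for a requested effort.

    Qwen3.6 yields "on" or "off"; DeepSeek yields the level it is mapped to.
    """
    if model == "Qwen3.6":
        if effort == "none":
            return "off"
        if effort in {"low", "medium", "high", "max"}:
            return "on"  # no internal levels: all of these behave identically
    elif model in {"DeepSeek-V4-Flash", "deepseek-v4"}:
        mapping = {
            "low": "high",
            "medium": "high",
            "high": "high",
            "xhigh": "max",
            "max": "max",
        }
        if effort in mapping:
            return mapping[effort]
    # Anything else (for example "none" on DeepSeek) is not documented.
    raise ValueError(f"{effort!r} is not documented for {model!r}")
```

For example, `effective_reasoning("deepseek-v4", "medium")` returns `"high"`, and `effective_reasoning("Qwen3.6", "none")` returns `"off"`.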

Sampling Parameters

Parameter	Type	Range	Default	Description
`temperature`	number	0.0 – 2.0	1.0	Controls randomness. Lower → deterministic, higher → creative. At 0, the model always picks the most likely token.
`top_p`	number	0.0 – 1.0	1.0	Nucleus sampling: only the most likely tokens whose cumulative probability reaches `top_p` are considered. Adjust either `temperature` or `top_p`, not both.
`top_k`	integer	0 – 100	20	Limits the sampling pool to the k most likely tokens. 0 = disabled (all tokens considered).
`min_p`	number	0.0 – 1.0	0	Minimum token probability relative to the most likely token. Dynamically filters unlikely tokens. 0 = disabled.
`presence_penalty`	number	-2.0 – 2.0	0	Penalizes tokens that have already appeared, encouraging the model to introduce new topics. 0 = no penalty.
`repetition_penalty`	number	1.0 – 2.0	1	Penalizes tokens based on their frequency in the output. 1.0 = no penalty; higher values discourage repetition.

Qwen3.6 recommended values (model card): Thinking: temperature 0.6, top_p 0.95, top_k 20, min_p 0. Non-thinking: temperature 0.7, top_p 0.8, top_k 20, min_p 0. Avoid greedy decoding (temperature 0) in thinking mode.

DeepSeek thinking mode: Temperature, top_p, presence_penalty, and frequency_penalty have no effect when thinking mode is enabled (default). They are silently ignored. Sampling parameters are meaningful for Qwen3.6 only.
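
The recommended Qwen3.6 values and the DeepSeek caveat can be captured in two small helpers. This is a sketch under assumptions: the helper names are hypothetical, and it assumes `top_k`, `min_p`, and `reasoning_effort` are accepted as top-level fields of the JSON body. With the OpenAI Python SDK, `top_k` and `min_p` are not named arguments, so they would go through the `extra_body` argument instead.

```python
# Recommended Qwen3.6 sampling values, keyed by whether thinking is enabled.
QWEN_RECOMMENDED = {
    True: {"temperature": 0.6, "top_p": 0.95, "top_k": 20, "min_p": 0},
    False: {"temperature": 0.7, "top_p": 0.8, "top_k": 20, "min_p": 0},
}

# Parameters documented as ignored by DeepSeek while thinking mode is on.
IGNORED_BY_DEEPSEEK_THINKING = {
    "temperature",
    "top_p",
    "presence_penalty",
    "frequency_penalty",
}


def qwen_chat_payload(messages, thinking=True):
    """Request body for Qwen3.6 using the recommended sampling values."""
    payload = {"model": "Qwen3.6", "messages": messages}
    payload.update(QWEN_RECOMMENDED[thinking])
    if not thinking:
        payload["reasoning_effort"] = "none"  # switches reasoning off
    return payload


def drop_ignored_for_deepseek(payload):
    """Return a copy without the parameters DeepSeek would silently ignore."""
    return {k: v for k, v in payload.items() if k not in IGNORED_BY_DEEPSEEK_THINKING}
```

Dropping the ignored parameters client-side is optional, but it keeps a request from suggesting that a setting has an effect when it does not.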

Max Tokens & Stop

Parameter	Type	Max	Default	Description
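`max_tokens`	integer	65,536	—	Maximum tokens the model can generate in a single response. Input and output share the 262,144-token context window, so a request that reserves the full 65,536 output tokens leaves about 197K tokens (262,144 − 65,536 = 196,608) for input.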
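
The token budget is simple arithmetic. The sketch below assumes that input and output tokens share the 262,144-token context window from the model comparison and that `max_tokens` is capped at 65,536; the function name `input_budget` is hypothetical.

```python
CONTEXT_WINDOW = 262_144  # tokens, from the model comparison table
MAX_OUTPUT = 65_536       # documented ceiling for max_tokens


def input_budget(max_tokens=MAX_OUTPUT):
    """Tokens left for the prompt after reserving max_tokens for the reply."""
    if not 1 <= max_tokens <= MAX_OUTPUT:
        raise ValueError(f"max_tokens must be between 1 and {MAX_OUTPUT}")
    return CONTEXT_WINDOW - max_tokens


print(input_budget())      # 196608
print(input_budget(4096))  # 258048
```

Lowering `max_tokens` frees up room for longer prompts, and the reverse also holds.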

Subscription

AI Router uses a simple flat-rate subscription model. One price, unlimited access, no surprises. Below is everything you need to know about subscribing, what you get, and how billing works.

Individual / Freelancer

One flat price, unlimited access to all models — perfect for solo developers, freelancers, hobbyists, and personal projects.

How to Subscribe

Visit airouter.ch — Click "Subscribe Now" on the landing page. You'll be redirected to Stripe's secure checkout.
Complete payment — Enter your email and payment details. Stripe handles all billing — we never see your card information.
Receive your API key — Immediately after successful payment, your API key is generated and available in your dashboard. You'll also receive a welcome email.
Set a password — Follow the link in your welcome email or visit the dashboard to set a password for account access.
Start building — Use your API key with any OpenAI-compatible SDK. No further setup required.

Already subscribed? Head to your dashboard to view your API key, manage your subscription, or generate additional keys.

What You Get

✓ Unlimited API requests (fair use)

✓ Qwen3.6 + DeepSeek-V4-Flash

✓ 262K context window

✓ Qwen3-Embedding access

✓ Vision & image input

✓ OpenAI-compatible API

✓ Streaming support (SSE)

✓ Swiss-hosted, no prompt logging

✓ Reasoning & tool calling

✓ Dashboard & key management

All prices in CHF. Swiss VAT may apply depending on your location. Subscriptions are billed monthly. Cancel anytime — access remains active until the end of the current billing period.

Teams / Business

For teams and organizations that need multi-seat access, consolidated billing, and custom pricing. Business accounts are invitation-only — contact us to get started.

Features

👥 Multi-seat: One account, multiple API keys

📄 Custom invoices: Tailored to your billing

📊 Volume discounts: Scales with team size

Contact support@airouter.ch to discuss your requirements.

Cancellation

You can cancel your subscription at any time via the Stripe customer portal, accessible from your dashboard. After cancellation, access remains active until the end of the current billing period.

If you need assistance, reach out to support@airouter.ch.

There is only one price.