Documentation
API reference, configuration, and subscription guide.
Intro
AI Router gives you a single OpenAI-compatible API endpoint for multiple models hosted in Switzerland. Everything is included in one flat monthly subscription — no per-token pricing, no surprises. This page covers the essentials to get you up and running.
Getting Started
AI Router provides a single OpenAI-compatible API endpoint for all models. Set your base URL, authenticate with an API key, and start sending requests in minutes. No special SDKs needed — any OpenAI client works as a drop-in replacement.
link Base URL
All requests go to this endpoint. The API follows the OpenAI API format for chat completions, embeddings, and streaming.
key Authentication
Authenticate by including your API key in the Authorization header.
Keys are generated per user and available in your dashboard after subscribing.
terminal Quick Start
Send your first chat completion with cURL:
code Python (OpenAI SDK)
Use the official OpenAI Python library — just change the base URL:
base_url to https://api.airouter.ch/v1 and use your API key.
Works with cURL, Python, Node.js, LangChain, OpenClaw, Claude Code, OpenCode, and more.
Models
AI Router provides access to three models through a single API endpoint. All models are hosted in Switzerland, included in the flat monthly subscription, and accessible via the OpenAI-compatible format.
table Model Comparison
| Property | Qwen3.6 | DeepSeek-V4-Flash | Qwen3-Embedding |
|---|---|---|---|
| Model ID | Qwen3.6 | DeepSeek-V4-Flash | Qwen3-Embedding |
| Parameters | 27B (dense) | 284B (13B active, MoE) | 4B (dense) |
| Context | 262,144 tokens | 262,144 tokens | 32,768 tokens |
| Max output | 65,536 tokens | 65,536 tokens | — |
| Reasoning | ✓ | ✓ | — |
| Tool calling | ✓ | ✓ | — |
| Image input | ✓ | ✓ augmented | — |
| Structured output | ✓ | ✓ | — |
| Streaming | ✓ (SSE) | ✓ (SSE) | — |
| Languages | 119 | 100+ | 100+ (incl. code) |
| Embedding dims | — | — | 2,560 |
DeepSeek-V4-Flash or the shorter deepseek-v4.
For Qwen3.6, use Qwen3.6.
For embeddings, use Qwen3-Embedding.
DeepSeek Vision Augmentation
DeepSeek-V4-Flash is a text-only model and does not natively support image inputs. AI Router bridges this gap by automatically routing image-containing requests through Qwen3.6, which natively supports vision. The augmentation is transparent — you send the same OpenAI-compatible request with image content and we handle the rest.
swap_horiz How It Works
-
Send your request — Include image content in the
usual OpenAI format (
content: [{ "type": "image_url", ... }]) usingDeepSeek-V4-Flashas the model. - Augmentation detects image content — The middleware detects image data in your request and routes it to Qwen3.6 for visual processing.
- Seamless response — You get a response as if DeepSeek handled the image itself. No special endpoint, no extra configuration.
- Follow-up turns use DeepSeek — Subsequent messages in the same conversation are routed back to DeepSeek-V4-Flash, including the image analysis context from Qwen3.6. DeepSeek reasons over the visual information and continues the conversation normally.
Configuration
Both models support standard OpenAI-compatible parameters. API calls always use the model's built-in defaults unless you override them. Below are the parameters you're most likely to need.
psychology reasoning_effort
Controls how much reasoning computation the model performs before responding.
Higher effort → more thorough reasoning, higher latency, better results on
complex tasks. Default is "high" for both models.
| Value | Qwen3.6 | DeepSeek | Actual behavior |
|---|---|---|---|
"none" |
✓ | — | Reasoning off. Direct answer, no chain-of-thought. Qwen3.6 only. |
"high" |
✓ | ✓ |
Default. Full reasoning. Good for most tasks.
"low" → "high" on DeepSeek.
"medium" → "high" on DeepSeek.
On Qwen3.6, all non-none values behave identically (on/off, no levels).
|
"max" |
✓ | ✓ |
Maximum reasoning. Higher latency, best for hard problems.
"xhigh" → "max" on DeepSeek.
|
"low" / "medium" → "high" ·
"xhigh" → "max".
Non-thinking mode is not available through reasoning_effort
and is not currently supported at AI Router for DeepSeek.
"none" disables reasoning;
"low", "medium", "high", and
"max" all enable it identically.
thermostat Sampling Parameters
| Parameter | Type | Range | Default | Description |
|---|---|---|---|---|
temperature |
number | 0.0 – 2.0 | 1.0 | Controls randomness. Lower → deterministic, higher → creative. At 0, the model always picks the most likely token. |
top_p |
number | 0.0 – 1.0 | 1.0 |
Nucleus sampling — only tokens whose cumulative probability reaches
top_p are considered. Adjust one or the other, not both.
|
top_k |
integer | 0 – 100 | 20 | Limits the sampling pool to the k most likely tokens. 0 = disabled (all tokens considered). |
min_p |
number | 0.0 – 1.0 | 0 | Minimum token probability relative to the most likely token. Dynamically filters unlikely tokens. 0 = disabled. |
presence_penalty |
number | -2.0 – 2.0 | 0 | Penalizes tokens that have already appeared, encouraging the model to introduce new topics. 0 = no penalty. |
repetition_penalty |
number | 1.0 – 2.0 | 1 | Penalizes tokens based on their frequency in the output. 1.0 = no penalty; higher values discourage repetition. |
data_thresholding Max Tokens & Stop
| Parameter | Type | Max | Default | Description |
|---|---|---|---|---|
max_tokens |
integer | 65,536 | — | Maximum tokens the model can generate in a single response. Up to ~197K output available (262K context minus input). |
Subscription
AI Router uses a simple flat-rate subscription model. One price, unlimited access, no surprises. Below is everything you need to know about subscribing, what you get, and how billing works.
Individual / Freelancer
One flat price, unlimited access to all models — perfect for solo developers, freelancers, hobbyists, and personal projects.
how_to_reg How to Subscribe
- Visit airouter.ch — Click "Subscribe Now" on the landing page. You'll be redirected to Stripe's secure checkout.
- Complete payment — Enter your email and payment details. Stripe handles all billing — we never see your card information.
- Receive your API key — Immediately after successful payment, your API key is generated and available in your dashboard. You'll also receive a welcome email.
- Set a password — Follow the link in your welcome email or visit the dashboard to set a password for account access.
- Start building — Use your API key with any OpenAI-compatible SDK. No further setup required.
visibility What You Get
All prices in CHF. Swiss VAT may apply depending on your location. Subscriptions are billed monthly. Cancel anytime — access remains active until the end of the current billing period.
Teams / Business
For teams and organizations that need multi-seat access, consolidated billing, and custom pricing. Business accounts are invitation-only — contact us to get started.
business Features
One account, multiple API keys
Tailored to your billing
Scales with team size
Contact support@airouter.ch to discuss your requirements.
Cancellation
You can cancel your subscription at any time via the Stripe customer portal, accessible from your dashboard. After cancellation, access remains active until the end of the current billing period.
If you need assistance, reach out to support@airouter.ch.