Documentation

API reference, configuration, and subscription guide.

Intro

AI Router gives you a single OpenAI-compatible API endpoint for multiple models hosted in Switzerland. Everything is included in one flat monthly subscription — no per-token pricing, no surprises. This page covers the essentials to get you up and running.

Getting Started

Set your base URL, authenticate with an API key, and start sending requests in minutes. No special SDKs are needed: any OpenAI client works as a drop-in replacement.

Base URL

export AIROUTER_BASE_URL="https://api.airouter.ch/v1"

All requests go to this endpoint. The API follows the OpenAI API format for chat completions, embeddings, and streaming.

Authentication

Authenticate by including your API key in the Authorization header. Keys are generated per user and available in your dashboard after subscribing.

export AIROUTER_API_KEY="sk-your-key-here"
curl -H "Authorization: Bearer $AIROUTER_API_KEY" ...
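The same header can be assembled in Python from the two environment variables shown above. This is a minimal sketch using only the standard library; the helper name `build_request` is hypothetical, and the request is built but never sent.

```python
import os
import urllib.request

DEFAULT_BASE_URL = "https://api.airouter.ch/v1"  # base URL given on this page


def build_request(path: str, body: bytes, env=os.environ) -> urllib.request.Request:
    """Build (but do not send) an authenticated POST request.

    Hypothetical helper for illustration, not part of any AI Router SDK.
    """
    base_url = env.get("AIROUTER_BASE_URL", DEFAULT_BASE_URL).rstrip("/")
    api_key = env["AIROUTER_API_KEY"]  # raises KeyError if the key is not set
    return urllib.request.Request(
        url=f"{base_url}/{path.lstrip('/')}",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
```

Passing the result to `urllib.request.urlopen` would send the request; keeping the key in the environment avoids hard-coding it in source files.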

Quick Start

Send your first chat completion with cURL:

curl "https://api.airouter.ch/v1/chat/completions" \
  -H "Authorization: Bearer $AIROUTER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Qwen3.6",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "What is the capital of Switzerland?"}
    ],
    "temperature": 0.7,
    "max_tokens": 4096
  }'

Python (OpenAI SDK)

Use the official OpenAI Python library — just change the base URL:

from openai import OpenAI

client = OpenAI(
    api_key="sk-your-key-here",
    base_url="https://api.airouter.ch/v1"
)

response = client.chat.completions.create(
    model="Qwen3.6",
    messages=[{"role": "user", "content": "Hello!"}]
)

print(response.choices[0].message.content)
OpenAI-compatible: Any SDK or tool that speaks the OpenAI API format works with AI Router. Just set base_url to https://api.airouter.ch/v1 and use your API key. Works with cURL, Python, Node.js, LangChain, OpenClaw, Claude Code, OpenCode, and more.
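Streaming is mentioned above but not shown. The sketch below assumes AI Router emits the usual OpenAI-style server-sent events: `data: {...}` lines carrying `choices[0].delta.content`, ending with `data: [DONE]`. The function name and the sample lines are illustrative and are not captured AI Router output.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from OpenAI-style SSE lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and SSE comments
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel used by the OpenAI format
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


# Hand-written sample chunks for illustration only.
sample = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "Be"}}]}',
    'data: {"choices": [{"delta": {"content": "rn"}}]}',
    "data: [DONE]",
]
print("".join(iter_stream_text(sample)))
```

With the OpenAI Python SDK this parsing is not needed: passing `stream=True` to `client.chat.completions.create` returns an iterator of already-parsed chunks.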

Models

AI Router provides access to three models through a single API endpoint. All models are hosted in Switzerland, included in the flat monthly subscription, and accessible via the OpenAI-compatible format.

Model Comparison

| Property | Qwen3.6 | DeepSeek-V4-Flash | Qwen3-Embedding |
| --- | --- | --- | --- |
| Model ID | Qwen3.6 | DeepSeek-V4-Flash | Qwen3-Embedding |
| Parameters | 27B (dense) | 284B (13B active, MoE) | 4B (dense) |
| Context | 262,144 tokens | 262,144 tokens | 32,768 tokens |
| Max output | 65,536 tokens | 65,536 tokens | |
| Reasoning | | | |
| Tool calling | | | |
| Image input | | augmented | |
| Structured output | | | |
| Streaming | (SSE) | (SSE) | |
| Languages | 119 | 100+ | 100+ (incl. code) |
| Embedding dims | | | 2,560 |
Model aliases: You can use DeepSeek-V4-Flash or the shorter deepseek-v4. For Qwen3.6, use Qwen3.6. For embeddings, use Qwen3-Embedding.
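A client can normalize model names before sending a request. The sketch below encodes only the IDs and the one alias listed above; the helper names are hypothetical, matching is treated as exact (whether the server is case-sensitive is not stated here), and the `{"model", "input"}` embedding body is an assumption based on the OpenAI embeddings format.

```python
# Alias and model IDs as listed on this page.
ALIASES = {"deepseek-v4": "DeepSeek-V4-Flash"}
MODEL_IDS = ("Qwen3.6", "DeepSeek-V4-Flash", "Qwen3-Embedding")


def canonical_model(name: str) -> str:
    """Resolve a documented alias to its full model ID (hypothetical helper)."""
    resolved = ALIASES.get(name, name)
    if resolved not in MODEL_IDS:
        raise ValueError(f"unknown model: {name!r}")
    return resolved


def embedding_payload(texts: list[str]) -> dict:
    """Build an OpenAI-style embeddings request body (assumed format)."""
    return {"model": canonical_model("Qwen3-Embedding"), "input": list(texts)}


print(canonical_model("deepseek-v4"))
print(embedding_payload(["Grüezi", "Bonjour"]))
```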

DeepSeek Vision Augmentation

DeepSeek-V4-Flash is a text-only model and does not natively support image inputs. AI Router bridges this gap by automatically routing image-containing requests through Qwen3.6, which natively supports vision. The augmentation is transparent — you send the same OpenAI-compatible request with image content and we handle the rest.

How It Works

  1. Send your request — Include image content in the usual OpenAI format (content: [{ "type": "image_url", ... }]) using DeepSeek-V4-Flash as the model.
  2. Augmentation detects image content — The middleware detects image data in your request and routes it to Qwen3.6 for visual processing.
  3. Seamless response — You get a response as if DeepSeek handled the image itself. No special endpoint, no extra configuration.
  4. Follow-up turns use DeepSeek — Subsequent messages in the same conversation are routed back to DeepSeek-V4-Flash, including the image analysis context from Qwen3.6. DeepSeek reasons over the visual information and continues the conversation normally.
Transparent: Use the same SDK, same endpoint, same API key. Vision augmentation is applied automatically when needed.
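A request that would trigger the augmentation described above might be built as follows. This is a sketch: the `image_url` content part follows the OpenAI chat format, while the use of a base64 `data:` URL, the helper name `image_message`, and the placeholder bytes are assumptions for illustration.

```python
import base64


def image_message(prompt: str, image_bytes: bytes, mime: str = "image/png") -> dict:
    """Build a user message with text plus an inline image (hypothetical helper)."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {
                "type": "image_url",
                "image_url": {"url": f"data:{mime};base64,{encoded}"},
            },
        ],
    }


# The model stays DeepSeek-V4-Flash; per the steps above, routing the image
# to Qwen3.6 happens on the server side.
payload = {
    "model": "DeepSeek-V4-Flash",
    "messages": [image_message("What is in this picture?", b"\x89PNG placeholder bytes")],
}
print(payload["messages"][0]["content"][1]["type"])
```

With the OpenAI SDK, the same `messages` list can be passed directly to `client.chat.completions.create`.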

Configuration

Both chat models (Qwen3.6 and DeepSeek-V4-Flash) support standard OpenAI-compatible parameters. API calls always use the model's built-in defaults unless you override them. Below are the parameters you're most likely to need.

reasoning_effort

Controls how much reasoning computation the model performs before responding. Higher effort → more thorough reasoning, higher latency, better results on complex tasks. Default is "high" for both models.

| Value | Qwen3.6 | DeepSeek | Actual behavior |
| --- | --- | --- | --- |
| "none" | yes | no | Reasoning off. Direct answer, no chain-of-thought. Qwen3.6 only. |
| "high" | yes | yes | Default. Full reasoning. Good for most tasks. "low" → "high" on DeepSeek. "medium" → "high" on DeepSeek. On Qwen3.6, all non-none values behave identically (on/off, no levels). |
| "max" | yes | yes | Maximum reasoning. Higher latency, best for hard problems. "xhigh" → "max" on DeepSeek. |
DeepSeek mapping reference: "low" / "medium" → "high" · "xhigh" → "max". Non-thinking mode is not available through reasoning_effort and is not currently supported at AI Router for DeepSeek.
Qwen3.6: Reasoning is on by default. There are no internal reasoning levels — it's simply on or off. "none" disables reasoning; "low", "medium", "high", and "max" all enable it identically.
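The mapping rules above can be summarized in a few lines of client-side code. This sketch only restates the documented behavior; the function name is hypothetical, treating "xhigh" on Qwen3.6 as "on" is inferred from "all non-none values behave identically", and raising an error for unsupported values is a choice made here, not necessarily what the server does.

```python
DEEPSEEK_MAP = {
    "low": "high",
    "medium": "high",
    "high": "high",
    "xhigh": "max",
    "max": "max",
}
QWEN_ON_VALUES = {"low", "medium", "high", "xhigh", "max"}  # "xhigh" is an inference


def effective_effort(model: str, requested: str = "high") -> str:
    """Return the behavior a requested reasoning_effort resolves to."""
    if model == "Qwen3.6":
        if requested == "none":
            return "off"
        if requested in QWEN_ON_VALUES:
            return "on"  # Qwen3.6 has no levels, only on/off
    elif model == "DeepSeek-V4-Flash":
        if requested in DEEPSEEK_MAP:
            return DEEPSEEK_MAP[requested]
    # Includes "none" on DeepSeek, which this page lists as unsupported.
    raise ValueError(f"{requested!r} is not supported for {model}")


print(effective_effort("DeepSeek-V4-Flash", "medium"))
print(effective_effort("Qwen3.6", "none"))
```

In a request, the value is passed as the `reasoning_effort` field next to `model` and `messages`.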

Sampling Parameters

| Parameter | Type | Range | Default | Description |
| --- | --- | --- | --- | --- |
| temperature | number | 0.0 – 2.0 | 1.0 | Controls randomness. Lower → deterministic, higher → creative. At 0, the model always picks the most likely token. |
| top_p | number | 0.0 – 1.0 | 1.0 | Nucleus sampling: only the most likely tokens whose cumulative probability reaches top_p are considered. Adjust either temperature or top_p, not both. |
| top_k | integer | 0 – 100 | 20 | Limits the sampling pool to the k most likely tokens. 0 = disabled (all tokens considered). |
| min_p | number | 0.0 – 1.0 | 0 | Minimum token probability relative to the most likely token. Dynamically filters unlikely tokens. 0 = disabled. |
| presence_penalty | number | -2.0 – 2.0 | 0 | Penalizes tokens that have already appeared, encouraging the model to introduce new topics. 0 = no penalty. |
| repetition_penalty | number | 1.0 – 2.0 | 1 | Penalizes tokens based on their frequency in the output. 1.0 = no penalty; higher values discourage repetition. |
Qwen3.6 recommended values (model card): Thinking: temperature 0.6, top_p 0.95, top_k 20, min_p 0. Non-thinking: temperature 0.7, top_p 0.8, top_k 20, min_p 0. Avoid greedy decoding (temperature 0) in thinking mode.
DeepSeek thinking mode: Temperature, top_p, presence_penalty, and frequency_penalty have no effect when thinking mode is enabled (default). They are silently ignored. Sampling parameters are meaningful for Qwen3.6 only.
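One way to apply the recommended Qwen3.6 values from the OpenAI Python SDK is sketched below. `temperature` and `top_p` are named arguments of `chat.completions.create`; `top_k` and `min_p` are not, so the sketch places them in the SDK's `extra_body` argument, on the assumption that AI Router reads them from the top level of the request body. The helper name is hypothetical.

```python
# Values quoted from the Qwen3.6 recommendation note above.
QWEN36_RECOMMENDED = {
    "thinking": {"temperature": 0.6, "top_p": 0.95, "top_k": 20, "min_p": 0.0},
    "non_thinking": {"temperature": 0.7, "top_p": 0.8, "top_k": 20, "min_p": 0.0},
}

# Sampling parameters the OpenAI Python SDK accepts as named arguments;
# anything else has to travel in extra_body.
SDK_NAMED = {"temperature", "top_p", "presence_penalty"}


def sampling_kwargs(model: str, thinking: bool = True) -> dict:
    """Return keyword arguments for client.chat.completions.create()."""
    if model != "Qwen3.6":
        # Per the warning above, DeepSeek ignores sampling parameters
        # in thinking mode, so none are sent.
        return {}
    params = QWEN36_RECOMMENDED["thinking" if thinking else "non_thinking"]
    kwargs = {k: v for k, v in params.items() if k in SDK_NAMED}
    kwargs["extra_body"] = {k: v for k, v in params.items() if k not in SDK_NAMED}
    return kwargs


print(sampling_kwargs("Qwen3.6", thinking=False))
```

The result can be splatted into a call, for example `client.chat.completions.create(model="Qwen3.6", messages=msgs, **sampling_kwargs("Qwen3.6"))`.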

Max Tokens & Stop

| Parameter | Type | Max | Default | Description |
| --- | --- | --- | --- | --- |
| max_tokens | integer | 65,536 | | Maximum tokens the model can generate in a single response. Up to ~197K output available (262K context minus input). |
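The token arithmetic can be made explicit. The sketch below assumes that input and output tokens share the 262,144-token context window and that the 65,536-token output figure from the Model Comparison table acts as a cap; both assumptions are readings of this page rather than confirmed server behavior, and the function name is hypothetical.

```python
CONTEXT_WINDOW = 262_144  # tokens, from the Model Comparison table
MAX_OUTPUT = 65_536       # tokens, from the Model Comparison table


def output_budget(input_tokens: int, cap: int = MAX_OUTPUT) -> int:
    """Largest max_tokens value that still fits next to the given input."""
    remaining = CONTEXT_WINDOW - input_tokens
    if remaining <= 0:
        raise ValueError("input already fills the context window")
    return min(cap, remaining)


# Room left for input when the full output cap is reserved.
print(CONTEXT_WINDOW - MAX_OUTPUT)  # 196608, roughly 197K
print(output_budget(1_000))         # 65536
print(output_budget(250_000))       # 12144
```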

Subscription

AI Router uses a simple flat-rate subscription model. One price, unlimited access, no surprises. Below is everything you need to know about subscribing, what you get, and how billing works.

Individual / Freelancer

One flat price, unlimited access to all models — perfect for solo developers, freelancers, hobbyists, and personal projects.

How to Subscribe

  1. Visit airouter.ch — Click "Subscribe Now" on the landing page. You'll be redirected to Stripe's secure checkout.
  2. Complete payment — Enter your email and payment details. Stripe handles all billing — we never see your card information.
  3. Receive your API key — Immediately after successful payment, your API key is generated and available in your dashboard. You'll also receive a welcome email.
  4. Set a password — Follow the link in your welcome email or visit the dashboard to set a password for account access.
  5. Start building — Use your API key with any OpenAI-compatible SDK. No further setup required.
Already subscribed? Head to your dashboard to view your API key, manage your subscription, or generate additional keys.

What You Get

Unlimited API requests (fair use)
Qwen3.6 + DeepSeek-V4-Flash
262K context window
Qwen3-Embedding access
Vision & image input
OpenAI-compatible API
Streaming support (SSE)
Swiss-hosted, no prompt logging
Reasoning & tool calling
Dashboard & key management

All prices in CHF. Swiss VAT may apply depending on your location. Subscriptions are billed monthly. Cancel anytime — access remains active until the end of the current billing period.

Teams / Business

For teams and organizations that need multi-seat access, consolidated billing, and custom pricing. Business accounts are invitation-only — contact us to get started.

Features

👥 Multi-seat

One account, multiple API keys

📄 Custom invoices

Tailored to your billing

📊 Volume discounts

Scales with team size

Contact support@airouter.ch to discuss your requirements.

Cancellation

You can cancel your subscription at any time via the Stripe customer portal, accessible from your dashboard. After cancellation, access remains active until the end of the current billing period.

If you need assistance, reach out to support@airouter.ch.