llm-router
Unified LLM Gateway - One API for 70+ AI models. Route to GPT, Claude, Gemini, Qwen, Deepseek, Grok and more with a single API key.
3,556 stars
by openclaw
Installation
Claude Code / Cursor / Codex
$ curl -o ~/.claude/skills/openclaw-aisa-llm-gateway/SKILL.md --create-dirs "https://raw.githubusercontent.com/openclaw/skills/main/skills/0xjordansg-yolo/openclaw-aisa-llm-gateway/SKILL.md"
Manual Installation
- Download SKILL.md from GitHub
- Place it in `.claude/skills/openclaw-aisa-llm-gateway/SKILL.md` inside your project
- Restart your AI agent; it will auto-discover the skill
How llm-router Compares
| Feature / Agent | llm-router | Standard Approach |
|---|---|---|
| Platform Support | multi | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Unified LLM Gateway - One API for 70+ AI models. Route to GPT, Claude, Gemini, Qwen, Deepseek, Grok and more with a single API key.
Which AI agents support this skill?
This skill is compatible with multiple agents, including Claude Code, Cursor, and Codex (see Installation).
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# OpenClaw LLM Router 🧠
**Unified LLM Gateway for autonomous agents. Powered by AIsa.**
One API key. 70+ models. OpenAI-compatible.
Replace 100+ API keys with one. Access GPT-4, Claude-3, Gemini, Qwen, Deepseek, Grok, and more through a unified, OpenAI-compatible API.
## 🔥 What Can You Do?
### Multi-Model Chat
```
"Chat with GPT-4 for reasoning, switch to Claude for creative writing"
```
### Model Comparison
```
"Compare responses from GPT-4, Claude, and Gemini for the same question"
```
### Vision Analysis
```
"Analyze this image with GPT-4o - what objects are in it?"
```
### Cost Optimization
```
"Route simple queries to fast/cheap models, complex queries to GPT-4"
```
### Fallback Strategy
```
"If GPT-4 fails, automatically try Claude, then Gemini"
```
## Why LLM Router?
| Feature | LLM Router | Direct APIs |
|---------|------------|-------------|
| API Keys | 1 | 10+ |
| SDK Compatibility | OpenAI SDK | Multiple SDKs |
| Billing | Unified | Per-provider |
| Model Switching | Change string | Code rewrite |
| Fallback Routing | Built-in | DIY |
| Cost Tracking | Unified | Fragmented |
## Supported Model Families
| Family | Developer | Example Models |
|--------|-----------|----------------|
| GPT | OpenAI | gpt-4.1, gpt-4o, gpt-4o-mini, o1, o1-mini, o3-mini |
| Claude | Anthropic | claude-3-5-sonnet, claude-3-opus, claude-3-sonnet |
| Gemini | Google | gemini-2.0-flash, gemini-1.5-pro, gemini-1.5-flash |
| Qwen | Alibaba | qwen-max, qwen-plus, qwen2.5-72b-instruct |
| Deepseek | Deepseek | deepseek-chat, deepseek-coder, deepseek-v3, deepseek-r1 |
| Grok | xAI | grok-2, grok-beta |
> **Note**: Model availability may vary. Check [marketplace.aisa.one/pricing](https://marketplace.aisa.one/pricing) for the full list of currently available models and pricing.
## Quick Start
```bash
export AISA_API_KEY="your-key"
```
## API Endpoints
### OpenAI-Compatible Chat Completions
```
POST https://api.aisa.one/v1/chat/completions
```
#### Request
```bash
curl -X POST "https://api.aisa.one/v1/chat/completions" \
  -H "Authorization: Bearer $AISA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4.1",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Explain quantum computing in simple terms."}
    ],
    "temperature": 0.7,
    "max_tokens": 1000
  }'
```
#### Parameters
| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `model` | string | Yes | Model identifier (e.g., `gpt-4.1`, `claude-3-sonnet`) |
| `messages` | array | Yes | Conversation messages |
| `temperature` | number | No | Randomness (0-2, default: 1) |
| `max_tokens` | integer | No | Maximum response tokens |
| `stream` | boolean | No | Enable streaming (default: false) |
| `top_p` | number | No | Nucleus sampling (0-1) |
| `frequency_penalty` | number | No | Frequency penalty (-2 to 2) |
| `presence_penalty` | number | No | Presence penalty (-2 to 2) |
| `stop` | string/array | No | Stop sequences |
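The table above can be turned into a small request builder. The following is a minimal sketch using only the standard library; `build_chat_request` is a hypothetical helper (not part of the bundled client), and the range checks simply mirror the table, so the gateway itself may enforce different limits.

```python
import json
import urllib.request

API_URL = "https://api.aisa.one/v1/chat/completions"


def build_chat_request(api_key, model, messages, **options):
    """Build (but do not send) a chat completion request."""
    # Range checks mirror the parameter table; the gateway may enforce others.
    temperature = options.get("temperature")
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")
    top_p = options.get("top_p")
    if top_p is not None and not 0 <= top_p <= 1:
        raise ValueError("top_p must be between 0 and 1")
    for name in ("frequency_penalty", "presence_penalty"):
        value = options.get(name)
        if value is not None and not -2 <= value <= 2:
            raise ValueError(f"{name} must be between -2 and 2")

    payload = {"model": model, "messages": messages, **options}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_chat_request(
    "your-key",
    "gpt-4.1",
    [{"role": "user", "content": "Hello!"}],
    temperature=0.7,
    max_tokens=1000,
)
# Send with urllib.request.urlopen(request) once a real key is set.
```

Validating locally avoids spending a round trip on a request that would likely be rejected anyway.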
#### Message Format
```json
{
  "role": "user|assistant|system",
  "content": "message text or array for multimodal"
}
```
#### Response
```json
{
  "id": "chatcmpl-xxx",
  "object": "chat.completion",
  "created": 1234567890,
  "model": "gpt-4.1",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Quantum computing uses..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 50,
    "completion_tokens": 200,
    "total_tokens": 250,
    "cost": 0.0025
  }
}
```
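To show how the response fields fit together, here is a minimal sketch that pulls out the reply text and usage figures. `summarize_completion` is a hypothetical helper, and treating `finish_reason == "length"` as truncation follows the OpenAI convention, which this gateway is assumed (not confirmed) to share.

```python
def summarize_completion(response: dict) -> dict:
    """Extract the fields most callers need from a chat completion."""
    choice = response["choices"][0]
    usage = response.get("usage", {})
    return {
        "text": choice["message"]["content"],
        # Assumption: "length" means the reply was cut off by max_tokens
        "truncated": choice.get("finish_reason") == "length",
        "total_tokens": usage.get("total_tokens"),
        "cost": usage.get("cost"),
    }


# Sample data copied from the response shown above
sample_response = {
    "id": "chatcmpl-xxx",
    "object": "chat.completion",
    "created": 1234567890,
    "model": "gpt-4.1",
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Quantum computing uses..."},
            "finish_reason": "stop",
        }
    ],
    "usage": {
        "prompt_tokens": 50,
        "completion_tokens": 200,
        "total_tokens": 250,
        "cost": 0.0025,
    },
}

summary = summarize_completion(sample_response)
print(summary["text"], summary["total_tokens"], summary["cost"])
```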
### Streaming Response
```bash
curl -X POST "https://api.aisa.one/v1/chat/completions" \
  -H "Authorization: Bearer $AISA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-3-sonnet",
    "messages": [{"role": "user", "content": "Write a poem about AI."}],
    "stream": true
  }'
```
Streaming returns Server-Sent Events (SSE):
```
data: {"id":"chatcmpl-xxx","choices":[{"delta":{"content":"In"}}]}
data: {"id":"chatcmpl-xxx","choices":[{"delta":{"content":" circuits"}}]}
...
data: [DONE]
```
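The event stream above can be reassembled into text line by line. This is a minimal sketch of such a parser, assuming each event is a single `data:` line shaped like the sample; `iter_sse_text` is a hypothetical name, and a production client would also need to handle network errors and multi-line events.

```python
import json


def iter_sse_text(lines):
    """Yield text fragments from an iterable of SSE lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


# Lines copied from the sample stream above
sample_stream = [
    'data: {"id":"chatcmpl-xxx","choices":[{"delta":{"content":"In"}}]}',
    "",
    'data: {"id":"chatcmpl-xxx","choices":[{"delta":{"content":" circuits"}}]}',
    "",
    "data: [DONE]",
]

poem = "".join(iter_sse_text(sample_stream))
print(poem)  # In circuits
```

With a live connection, the same generator can be fed the decoded lines of the HTTP response body instead of the sample list.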
### Vision / Image Analysis
Analyze images by passing image URLs or base64 data:
```bash
curl -X POST "https://api.aisa.one/v1/chat/completions" \
  -H "Authorization: Bearer $AISA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {
        "role": "user",
        "content": [
          {"type": "text", "text": "What is in this image?"},
          {"type": "image_url", "image_url": {"url": "https://example.com/image.jpg"}}
        ]
      }
    ]
  }'
```
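The curl example uses an image URL. For base64 data, a minimal sketch follows; it assumes the gateway accepts the OpenAI-style `data:` URL in the `image_url.url` field, which the source does not spell out. `image_message` is a hypothetical helper.

```python
import base64


def image_message(prompt: str, image_bytes: bytes, mime_type: str = "image/jpeg") -> dict:
    """Build a multimodal user message with an inline base64 image."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    # Assumption: the gateway accepts data URLs the way the OpenAI API does
    data_url = f"data:{mime_type};base64,{encoded}"
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": data_url}},
        ],
    }


# Three placeholder bytes stand in for a real file read with open(path, "rb")
message = image_message("What is in this image?", b"\xff\xd8\xff")
print(message["content"][1]["image_url"]["url"])
```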
### Function Calling
Enable tools/functions for structured outputs:
```bash
curl -X POST "https://api.aisa.one/v1/chat/completions" \
  -H "Authorization: Bearer $AISA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4.1",
    "messages": [{"role": "user", "content": "What is the weather in Tokyo?"}],
    "functions": [
      {
        "name": "get_weather",
        "description": "Get current weather for a location",
        "parameters": {
          "type": "object",
          "properties": {
            "location": {"type": "string", "description": "City name"},
            "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}
          },
          "required": ["location"]
        }
      }
    ],
    "function_call": "auto"
  }'
```
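The source does not show what the model sends back when it decides to call a function. The sketch below assumes the legacy OpenAI shape, where the assistant message carries a `function_call` object with a `name` and a JSON-encoded `arguments` string. The local `get_weather` implementation and the `HANDLERS` table are hypothetical.

```python
import json


def get_weather(location: str, unit: str = "celsius") -> dict:
    # Hypothetical local implementation; returns canned data for illustration
    return {"location": location, "unit": unit, "forecast": "unknown"}


HANDLERS = {"get_weather": get_weather}


def dispatch_function_call(message: dict):
    """Run the function the model asked for, or return None for plain text."""
    call = message.get("function_call")
    if call is None:
        return None  # the model answered directly
    handler = HANDLERS[call["name"]]
    arguments = json.loads(call["arguments"])  # arguments arrive as a JSON string
    return handler(**arguments)


# Assumed response shape, not taken from the gateway's documentation
sample_message = {
    "role": "assistant",
    "content": None,
    "function_call": {"name": "get_weather", "arguments": '{"location": "Tokyo"}'},
}

result = dispatch_function_call(sample_message)
print(result)
```

The function result would then typically be sent back to the model in a follow-up message so it can write the final answer.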
### Google Gemini Format
For Gemini models, you can also use the native format:
```
POST https://api.aisa.one/v1/models/{model}:generateContent
```
```bash
curl -X POST "https://api.aisa.one/v1/models/gemini-2.0-flash:generateContent" \
  -H "Authorization: Bearer $AISA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [
      {
        "role": "user",
        "parts": [{"text": "Explain machine learning."}]
      }
    ],
    "generationConfig": {
      "temperature": 0.7,
      "maxOutputTokens": 1000
    }
  }'
```
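If you already hold a conversation in the OpenAI message format, it can be converted to the native body shown above. The following is a minimal sketch with a hypothetical `to_gemini_body` helper; it handles plain-text content only. The `systemInstruction` field comes from Google's Gemini API, and whether this gateway forwards it is an assumption.

```python
def to_gemini_body(messages, temperature=None, max_output_tokens=None):
    """Convert OpenAI-style messages to a generateContent request body."""
    contents, system_parts = [], []
    for message in messages:
        part = {"text": message["content"]}
        if message["role"] == "system":
            system_parts.append(part)
            continue
        # Gemini names the assistant role "model"
        role = "model" if message["role"] == "assistant" else "user"
        contents.append({"role": role, "parts": [part]})

    body = {"contents": contents}
    if system_parts:
        # Assumption: the gateway passes this Gemini field through unchanged
        body["systemInstruction"] = {"parts": system_parts}
    generation_config = {}
    if temperature is not None:
        generation_config["temperature"] = temperature
    if max_output_tokens is not None:
        generation_config["maxOutputTokens"] = max_output_tokens
    if generation_config:
        body["generationConfig"] = generation_config
    return body


body = to_gemini_body(
    [
        {"role": "system", "content": "You are a concise tutor."},
        {"role": "user", "content": "Explain machine learning."},
    ],
    temperature=0.7,
    max_output_tokens=1000,
)
print(body)
```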
## Python Client
### Installation
No installation required - uses standard library only.
### CLI Usage
```bash
# Basic completion
python3 {baseDir}/scripts/llm_router_client.py chat --model gpt-4.1 --message "Hello, world!"
# With system prompt
python3 {baseDir}/scripts/llm_router_client.py chat --model claude-3-sonnet --system "You are a poet" --message "Write about the moon"
# Streaming
python3 {baseDir}/scripts/llm_router_client.py chat --model gpt-4o --message "Tell me a story" --stream
# Multi-turn conversation
python3 {baseDir}/scripts/llm_router_client.py chat --model qwen-max --messages '[{"role":"user","content":"Hi"},{"role":"assistant","content":"Hello!"},{"role":"user","content":"How are you?"}]'
# Vision analysis
python3 {baseDir}/scripts/llm_router_client.py vision --model gpt-4o --image "https://example.com/image.jpg" --prompt "Describe this image"
# List supported models
python3 {baseDir}/scripts/llm_router_client.py models
# Compare models
python3 {baseDir}/scripts/llm_router_client.py compare --models "gpt-4.1,claude-3-sonnet,gemini-2.0-flash" --message "What is 2+2?"
```
### Python SDK Usage
```python
from llm_router_client import LLMRouterClient

client = LLMRouterClient()  # Uses AISA_API_KEY env var

# Simple chat
response = client.chat(
    model="gpt-4.1",
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response["choices"][0]["message"]["content"])

# With options
response = client.chat(
    model="claude-3-sonnet",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain relativity."}
    ],
    temperature=0.7,
    max_tokens=500
)

# Streaming
for chunk in client.chat_stream(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Write a story."}]
):
    print(chunk, end="", flush=True)

# Vision
response = client.vision(
    model="gpt-4o",
    image_url="https://example.com/image.jpg",
    prompt="What's in this image?"
)

# Compare models
results = client.compare_models(
    models=["gpt-4.1", "claude-3-sonnet", "gemini-2.0-flash"],
    message="Explain quantum computing"
)
for model, result in results.items():
    print(f"{model}: {result['response'][:100]}...")
```
## Use Cases
### 1. Cost-Optimized Routing
Use cheaper models for simple tasks:
```python
def smart_route(message: str) -> dict:
    if len(message) < 50:
        # Simple queries -> fast/cheap model
        model = "gpt-3.5-turbo"
    else:
        # Complex reasoning -> powerful model
        model = "gpt-4.1"
    # Returns the full completion response, not just the reply text
    return client.chat(model=model, messages=[{"role": "user", "content": message}])
```
### 2. Fallback Strategy
Automatic fallback on failure:
```python
def chat_with_fallback(message: str) -> dict:
    models = ["gpt-4.1", "claude-3-sonnet", "gemini-2.0-flash"]
    last_error = None
    for model in models:
        try:
            return client.chat(model=model, messages=[{"role": "user", "content": message}])
        except Exception as exc:
            # Remember the failure and try the next model in the list
            last_error = exc
    raise RuntimeError("All models failed") from last_error
```
### 3. Model A/B Testing
Compare model outputs:
```python
results = client.compare_models(
    models=["gpt-4.1", "claude-3-opus"],
    message="Analyze this quarterly report..."
)

# Log for analysis (log_response is a placeholder for your own logging function)
for model, result in results.items():
    log_response(model=model, latency=result["latency"], cost=result["cost"])
```
### 4. Specialized Model Selection
Choose the best model for each task:
```python
MODEL_MAP = {
    "code": "deepseek-coder",
    "creative": "claude-3-opus",
    "fast": "gpt-3.5-turbo",
    "vision": "gpt-4o",
    "chinese": "qwen-max",
    "reasoning": "gpt-4.1"
}


def route_by_task(task_type: str, message: str) -> dict:
    # Unknown task types fall back to gpt-4.1
    model = MODEL_MAP.get(task_type, "gpt-4.1")
    return client.chat(model=model, messages=[{"role": "user", "content": message}])
```
## Error Handling
Errors return JSON with an `error` field:
```json
{
  "error": {
    "code": "model_not_found",
    "message": "Model 'xyz' is not available"
  }
}
```
Common HTTP status codes:
- `401` - Invalid or missing API key
- `402` - Insufficient credits
- `404` - Model not found
- `429` - Rate limit exceeded
- `500` - Server error
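One way to act on these status codes is to retry only the transient ones. The sketch below treats `429` and `500` as retryable, which is a common convention rather than documented gateway behavior. `call_with_retry` is a hypothetical helper, and `send` stands for any callable that returns a `(status_code, body)` pair.

```python
import time

RETRYABLE = {429, 500}  # assumption: these are transient; 401/402/404 are not


def call_with_retry(send, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call `send` until it succeeds, backing off on retryable errors."""
    for attempt in range(max_attempts):
        status, body = send()
        if status < 400:
            return body
        if status not in RETRYABLE or attempt == max_attempts - 1:
            error = body.get("error", {})
            raise RuntimeError(f"{status} {error.get('code')}: {error.get('message')}")
        sleep(base_delay * 2 ** attempt)  # exponential backoff


# Simulated transport: one rate-limit response, then success.
# The error code string here is made up for the demo.
responses = iter([
    (429, {"error": {"code": "rate_limited", "message": "Slow down"}}),
    (200, {"ok": True}),
])
delays = []
result = call_with_retry(lambda: next(responses), sleep=delays.append)
print(result, delays)
```

Passing `sleep` as a parameter keeps the helper easy to test without real waiting.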
## Best Practices
1. **Use streaming** for long responses to improve UX
2. **Set max_tokens** to control costs
3. **Implement fallback** for production reliability
4. **Cache responses** for repeated queries
5. **Monitor usage** via response metadata
6. **Use appropriate models** - don't use GPT-4 for simple tasks
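The caching recommendation above can be sketched as a thin in-memory wrapper keyed on the full request. `ResponseCache` is a hypothetical class, and `send` stands for whatever function actually calls the gateway. Caching is mostly useful for deterministic settings such as `temperature=0`.

```python
import hashlib
import json


class ResponseCache:
    """Memoize chat responses by model, messages, and options."""

    def __init__(self, send):
        self._send = send
        self._store = {}
        self.hits = 0

    def chat(self, model, messages, **options):
        key_source = json.dumps(
            {"model": model, "messages": messages, "options": options},
            sort_keys=True,  # stable key regardless of option order
        )
        key = hashlib.sha256(key_source.encode("utf-8")).hexdigest()
        if key in self._store:
            self.hits += 1
            return self._store[key]
        response = self._send(model=model, messages=messages, **options)
        self._store[key] = response
        return response


calls = []


def fake_send(model, messages, **options):
    # Stand-in for a real gateway call, so the example runs offline
    calls.append(model)
    return {"choices": [{"message": {"content": f"reply {len(calls)}"}}]}


cache = ResponseCache(fake_send)
question = [{"role": "user", "content": "What is 2+2?"}]
first = cache.chat("gpt-4o-mini", question, temperature=0)
second = cache.chat("gpt-4o-mini", question, temperature=0)
print(len(calls), cache.hits)
```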
## OpenAI SDK Compatibility
Just change the base URL and key:
```python
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["AISA_API_KEY"],
    base_url="https://api.aisa.one/v1"
)

response = client.chat.completions.create(
    model="gpt-4.1",
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response.choices[0].message.content)
```
## Pricing
Token-based pricing varies by model. Check [marketplace.aisa.one/pricing](https://marketplace.aisa.one/pricing) for current rates.
| Model Family | Approximate Cost |
|--------------|------------------|
| GPT-4.1 / GPT-4o | ~$0.01 / 1K tokens |
| Claude-3-Sonnet | ~$0.01 / 1K tokens |
| Gemini-2.0-Flash | ~$0.001 / 1K tokens |
| Qwen-Max | ~$0.005 / 1K tokens |
| DeepSeek-V3 | ~$0.002 / 1K tokens |
Every response includes `usage.cost` and `usage.credits_remaining`.
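For rough budgeting before a call is made, the approximate rates in the table can be applied to a token count. This is a minimal sketch with a hypothetical `estimate_cost` helper. The figures are the table's approximations, and mapping GPT-4o to the same rate as GPT-4.1 follows the table's combined row. Actual billing comes from `usage.cost`.

```python
# Approximate USD per 1K tokens, copied from the pricing table above
APPROX_COST_PER_1K = {
    "gpt-4.1": 0.01,
    "gpt-4o": 0.01,
    "claude-3-sonnet": 0.01,
    "gemini-2.0-flash": 0.001,
    "qwen-max": 0.005,
    "deepseek-v3": 0.002,
}


def estimate_cost(model: str, total_tokens: int) -> float:
    """Estimate the cost in USD for a given token count."""
    return total_tokens / 1000 * APPROX_COST_PER_1K[model]


def cheapest_model(candidates, total_tokens):
    """Pick the candidate with the lowest estimated cost."""
    return min(candidates, key=lambda model: estimate_cost(model, total_tokens))


estimate = estimate_cost("gpt-4.1", 250)
choice = cheapest_model(["gpt-4.1", "gemini-2.0-flash", "qwen-max"], 250)
print(f"{estimate:.4f}", choice)
```

For 250 tokens on `gpt-4.1` the estimate comes to about $0.0025, which lines up with the `usage.cost` value in the sample response.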
## Get Started
1. Sign up at [aisa.one](https://aisa.one)
2. Get your API key from the dashboard
3. Add credits (pay-as-you-go)
4. Set environment variable: `export AISA_API_KEY="your-key"`
## Full API Reference
See [API Reference](https://aisa.mintlify.app/api-reference/introduction) for complete endpoint documentation.