multi-llm
Multi-LLM intelligent switching. Use the 'multi llm' command to activate local model selection based on task type. Without the command, Claude Opus 4.5 is used by default.
Best use case
multi-llm is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Teams using multi-llm should expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in .claude/skills/multi-llm/SKILL.md inside your project
- Restart your AI agent; it will auto-discover the skill
How multi-llm Compares
| Feature / Agent | multi-llm | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Multi-LLM intelligent switching. Use the 'multi llm' command to activate local model selection based on task type. Without the command, Claude Opus 4.5 is used by default.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# Multi-LLM - Intelligent Model Switching
**Trigger Command**: `multi llm`
> **Default Behavior**: Always use Claude Opus 4.5 (strongest model)
> Only when the message contains `multi llm` command will local model selection be activated.
## What's New in v1.1.0
- Renamed trigger from `mlti llm` to `multi llm` (clearer naming)
- Enhanced model existence checking with fallback chain
- Added detailed usage examples and troubleshooting
- Improved task detection patterns
## Usage
### Default Mode (without command)
```
Help me write a Python function -> Uses Claude Opus 4.5
Analyze this code -> Uses Claude Opus 4.5
```
### Multi-Model Mode (with command)
```
multi llm Help me write a Python function -> Selects qwen2.5-coder:32b
multi llm Analyze this math proof -> Selects deepseek-r1:70b
multi llm Translate to Chinese -> Selects glm4:9b
```
## Command Format
| Command | Description |
|---------|-------------|
| `multi llm` | Activate intelligent model selection |
| `multi llm coding` | Force coding model |
| `multi llm reasoning` | Force reasoning model |
| `multi llm chinese` | Force Chinese model |
| `multi llm general` | Force general model |
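
The command forms in the table can be told apart with a small parser. The following is a minimal sketch, not the skill's actual `select-model.sh`: the function name `parse_command` and its output labels (`default`, `auto`) are hypothetical, and it assumes that matching is case-insensitive and that a force keyword must be the first word after the trigger.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the skill's scripts).
# Prints one of: default | auto | coding | reasoning | chinese | general
parse_command() {
  local msg rest word
  # Lowercase ASCII letters only. Case-insensitive matching is an assumption.
  msg=$(printf '%s' "$1" | LC_ALL=C tr 'A-Z' 'a-z')

  # No trigger in the message: stay on the default model.
  case "$msg" in
    *"multi llm"*) ;;
    *) echo "default"; return 0 ;;
  esac

  # Text after the first occurrence of the trigger, then its first word.
  rest=${msg#*"multi llm"}
  word=$(printf '%s\n' "$rest" | awk '{ print $1; exit }')

  # A recognized force keyword wins; anything else means automatic detection.
  case "$word" in
    coding|reasoning|chinese|general) echo "$word" ;;
    *) echo "auto" ;;
  esac
}

# Usage:
#   parse_command "multi llm reasoning Prove that sqrt(2) is irrational"
```

One consequence of this design is that a request such as `multi llm general overview of X` is read as a forced `general` command, since the first word after the trigger is always checked against the force keywords.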
## Model Mapping
**Primary Model (Default)**: github-copilot/claude-opus-4.5
**Local Models (when `multi llm` triggered)**:
| Task Type | Model | Size | Best For |
|-----------|-------|------|----------|
| Coding | qwen2.5-coder:32b | 19GB | Code generation, debugging, refactoring |
| Reasoning | deepseek-r1:70b | 42GB | Math, logic, complex analysis |
| Chinese | glm4:9b | 5.5GB | Translation, summaries, quick tasks |
| General | qwen3:32b | 20GB | General purpose, fallback |
### Fallback Chain
If the selected model is unavailable, the system tries alternatives:
```
Coding: qwen2.5-coder:32b -> qwen2.5-coder:14b -> qwen3:32b
Reasoning: deepseek-r1:70b -> deepseek-r1:32b -> qwen3:32b
Chinese: glm4:9b -> qwen3:8b -> qwen3:32b
General: qwen3:32b -> qwen3:14b -> qwen3:8b
```
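
The fallback chains listed above can be walked with a short loop. This is a sketch under stated assumptions rather than the skill's own logic: `chain_for` and `first_available` are hypothetical names, and the list of installed models is passed in as a newline-separated string (for example, the first column of `ollama list`) so the function can be exercised without a running Ollama instance.

```bash
#!/usr/bin/env bash
# Hypothetical helpers; the chains are copied from the fallback list above.
chain_for() {
  case "$1" in
    coding)    echo "qwen2.5-coder:32b qwen2.5-coder:14b qwen3:32b" ;;
    reasoning) echo "deepseek-r1:70b deepseek-r1:32b qwen3:32b" ;;
    chinese)   echo "glm4:9b qwen3:8b qwen3:32b" ;;
    *)         echo "qwen3:32b qwen3:14b qwen3:8b" ;;  # general
  esac
}

# first_available CATEGORY INSTALLED
# INSTALLED is a newline-separated list of model names, for example from:
#   ollama list | awk 'NR > 1 { print $1 }'
# Prints the first model in the chain that is installed; returns 1 if none is.
first_available() {
  local category=$1 installed=$2 model
  for model in $(chain_for "$category"); do
    # -x: match the whole line, -F: treat the name as a fixed string.
    if printf '%s\n' "$installed" | grep -qxF -- "$model"; then
      echo "$model"
      return 0
    fi
  done
  return 1
}
```

Returning a non-zero status when the whole chain is exhausted leaves it to the caller to decide what happens next, for example falling back to the default model.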
## Detection Logic
```
User Input
    |
    v
Contains "multi llm"?
    |
    +-- No  -> Use Claude Opus 4.5 (default)
    |
    +-- Yes -> Task Type Detection
                          |
    +----------+----------+----------+
    v          v          v          v
  Coding   Reasoning   Chinese    General
    |          |          |          |
    v          v          v          v
 qwen2.5   deepseek     glm4       qwen3
  coder     r1:70b       :9b       :32b
```
### Task Detection Keywords
| Category | Keywords (EN) | Keywords (CN) |
|----------|---------------|---------------|
| Coding | code, debug, function, script, api, bug, refactor, python, java, javascript | 代码, 编程, 函数, 调试, 重构 |
| Reasoning | analysis, proof, logic, math, solve, algorithm, evaluate | 推理, 分析, 证明, 逻辑, 数学, 计算, 算法 |
| Chinese | translate, summary | 翻译, 总结, 摘要, 简单, 快速 |
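
The keyword table can be turned into a simple classifier. The sketch below is illustrative only: `detect_task` is a hypothetical name, and both plain substring matching and the precedence order (coding, then reasoning, then Chinese, then general) are assumptions, since the source does not say how the real script matches keywords or resolves ties.

```bash
#!/usr/bin/env bash
# Hypothetical classifier built from the keyword table above.
# Prints one of: coding | reasoning | chinese | general
detect_task() {
  local msg
  # Lowercase ASCII letters only; multibyte (CN) text passes through unchanged.
  msg=$(printf '%s' "$1" | LC_ALL=C tr 'A-Z' 'a-z')

  # The first matching arm wins, so this order is the assumed precedence.
  # "javascript" needs no pattern of its own: *java* and *script* cover it.
  case "$msg" in
    *code*|*debug*|*function*|*script*|*api*|*bug*|*refactor*|*python*|*java*) echo "coding" ;;
    *代码*|*编程*|*函数*|*调试*|*重构*) echo "coding" ;;
    *analysis*|*proof*|*logic*|*math*|*solve*|*algorithm*|*evaluate*) echo "reasoning" ;;
    *推理*|*分析*|*证明*|*逻辑*|*数学*|*计算*|*算法*) echo "reasoning" ;;
    *translate*|*summary*) echo "chinese" ;;
    *翻译*|*总结*|*摘要*|*简单*|*快速*) echo "chinese" ;;
    *) echo "general" ;;
  esac
}
```

Because the match is on substrings, short keywords can produce false positives (for example, `api` inside `rapid`, or `script` inside `description`). When that happens, the force commands are the reliable override.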
## Examples
### Example 1: Coding Task
```bash
# Input
multi llm Write a Python function to calculate fibonacci
# Output
Selected: qwen2.5-coder:32b
Reason: Detected coding task (keywords: python, function)
```
### Example 2: Math Analysis
```bash
# Input
multi llm reasoning Prove that sqrt(2) is irrational
# Output
Selected: deepseek-r1:70b
Reason: Force command 'reasoning' used
```
### Example 3: Quick Translation
```bash
# Input
multi llm 把这段话翻译成英文
# Output
Selected: glm4:9b
Reason: Detected Chinese lightweight task (keywords: 翻译)
```
### Example 4: Default (No trigger)
```bash
# Input
Write a REST API with authentication
# Output
Selected: claude-opus-4.5
Reason: Default model (no 'multi llm' trigger)
```
## Prerequisites
1. **Ollama** must be installed and running:
```bash
# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh
# Start the Ollama service (runs in the foreground, so run the pull commands from another terminal)
ollama serve
# Pull required models
ollama pull qwen2.5-coder:32b
ollama pull deepseek-r1:70b
ollama pull glm4:9b
ollama pull qwen3:32b
```
2. **Check available models**:
```bash
ollama list
```
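
To see which of the four required models still need pulling, the output of `ollama list` can be compared against the list above. This is a small sketch, assuming the model name is the first column of `ollama list` and that the first row is a header; `REQUIRED_MODELS` and `missing_models` are hypothetical names introduced for illustration.

```bash
#!/usr/bin/env bash
# Models named in the Prerequisites section (hypothetical variable name).
REQUIRED_MODELS="qwen2.5-coder:32b deepseek-r1:70b glm4:9b qwen3:32b"

# Reads `ollama list` output on stdin and prints each required model
# that does not appear in it, one per line.
missing_models() {
  local installed model
  # Skip the header row and keep the first column (assumed to be the name).
  installed=$(awk 'NR > 1 { print $1 }')
  for model in $REQUIRED_MODELS; do
    printf '%s\n' "$installed" | grep -qxF -- "$model" || echo "$model"
  done
}

# Usage (requires a running Ollama instance):
#   ollama list | missing_models | while read -r m; do ollama pull "$m"; done
```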
## Troubleshooting
### Model not found
```bash
# Check if model exists
ollama list | grep "qwen2.5-coder"
# Pull missing model
ollama pull qwen2.5-coder:32b
```
### Ollama not running
```bash
# Check service status
curl -s http://localhost:11434/api/tags
# Start Ollama
ollama serve &
```
### Slow response
- Large models (70b) require significant RAM/VRAM
- Consider using smaller variants: `deepseek-r1:32b` instead of `70b`
### Wrong model selected
- Use force commands: `multi llm coding`, `multi llm reasoning`
- Check if keywords match your task type
## Files in This Skill
```
multi-llm/
├── SKILL.md              # This documentation
└── scripts/
    ├── select-model.sh   # Model selection logic
    └── fallback-demo.sh  # Interactive demo script
```
## Integration
### With OpenCode/Claude Code
The trigger `multi llm` is detected in your message. Simply prefix your request:
```
multi llm [your request here]
```
### Programmatic Usage
```bash
# Get recommended model for a task
./scripts/select-model.sh "multi llm write a sorting algorithm"
# Output: qwen2.5-coder:32b
# Demo with actual model call
./scripts/fallback-demo.sh --force-local "explain recursion"
```
## Author
- GitHub: [@leohan123123](https://github.com/leohan123123)
## License
MIT