switchailocal
Unified LLM proxy for AI agents. Route all model requests through http://localhost:18080/v1. Provides FREE access to Gemini CLI, Claude CLI, Codex, and Vibe via your existing subscriptions. Use when: (1) making LLM calls using provider prefixes, (2) switching between CLI/Local/Cloud providers, (3) needing to attach local files/folders to prompts via CLI, (4) requiring intelligent routing between models, or (5) needing to monitor provider health and analytics.
Best use case
switchailocal is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Teams using switchailocal should expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in .claude/skills/switchailocal/SKILL.md inside your project
- Restart your AI agent; it will auto-discover the skill
How switchailocal Compares
| Feature / Agent | switchailocal | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
It acts as a unified LLM proxy for AI agents: all model requests go through http://localhost:18080/v1, and provider prefixes select between CLI, local, and cloud providers.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# switchAILocal Proxy
Unified LLM proxy for AI agents. Always use `http://localhost:18080/v1` as your base URL.
**The killer feature**: Use your paid CLI subscriptions (Gemini Pro, Claude Pro, etc.) via the API. **It's FREE** in the sense that there is no additional charge beyond the subscription you already pay for!
---
## Quick Start
### 1. Make a request (FREE with CLI)
```bash
curl http://localhost:18080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "geminicli:",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```
### 2. Configure Python Client
```python
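from openai import OpenAI

# Point the OpenAI SDK at the proxy; the sample key may need to match the one in config.yaml.
client = OpenAI(base_url="http://localhost:18080/v1", api_key="sk-test-123")
response = client.chat.completions.create(model="geminicli:", messages=[{"role": "user", "content": "Hi!"}])
print(response.choices[0].message.content)  # show the reply text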
```
---
## 🗺️ Skill Files
| File | Description |
| ------------------------------------------------------------ | ------------------------------------ |
| **SKILL.md** (this file) | Core workflow and endpoint reference |
| [references/routing.md](references/routing.md) | Intelligent routing and matrix setup |
| [references/multimodal.md](references/multimodal.md) | Vision and image processing |
| [references/examples.md](references/examples.md) | Real-world agentic use cases |
| [references/management-api.md](references/management-api.md) | Full Monitoring & Operations API |
| [references/steering.md](references/steering.md) | Conditional routing rules |
| [references/hooks.md](references/hooks.md) | Automation and event hooks |
| [references/memory.md](references/memory.md) | Analytics and history |
---
## ⚠️ Critical: Model Format
**NEVER use bare model names.** Format is ALWAYS `provider:` or `provider:model`. The one entry written without a colon is the routing keyword `auto` (see Local & Cloud below).
| ❌ Wrong | ✅ Correct | Why |
| ------------------- | -------------------------- | ------------------------- |
| `gemini-2.5-pro` | `geminicli:gemini-2.5-pro` | Needs provider prefix |
| `claude-3-5-sonnet` | `claudecli:` | `claudecli:` uses default |
| `llama3` | `ollama:llama3` | Needs provider prefix |
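A client can catch bare model names before a request is sent. The following is a minimal sketch, not part of switchAILocal: `is_qualified` is a hypothetical helper that checks only the `provider:` / `provider:model` shape shown in the table, and it assumes `auto` is the single keyword accepted without a colon.

```python
import re

# Hypothetical client-side check; not part of switchAILocal itself.
# Accepts "provider:" or "provider:model"; "auto" is assumed to be the
# only keyword used without a colon.
_MODEL_RE = re.compile(r"^[a-z0-9_-]+:\S*$")


def is_qualified(model: str) -> bool:
    """Return True if the model string carries a provider prefix."""
    return model == "auto" or bool(_MODEL_RE.match(model))


print(is_qualified("gemini-2.5-pro"))            # False
print(is_qualified("geminicli:gemini-2.5-pro"))  # True
print(is_qualified("claudecli:"))                # True
```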
---
## 🏗️ Provider Reference
### 1. CLI Providers (FREE!)
Uses your human's CLI subscriptions. Best for agents.
| Prefix | CLI | Subscription Required |
| ------------ | -------- | --------------------- |
| `geminicli:` | `gemini` | Google AI Premium/Pro |
| `claudecli:` | `claude` | Claude Pro/Max |
| `codex:` | `codex` | OpenAI Plus |
| `vibe:` | `vibe` | Mistral Le Chat |
### 2. Local & Cloud
| Prefix | Source | Cost |
| ----------- | -------------- | ---------------------- |
| `ollama:` | Local Ollama | FREE |
| `auto` | Local Cortex | FREE (Requires plugin) |
| `switchai:` | Traylinx Cloud | Per-token |
| `groq:` | Groq Cloud | Per-token |
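The two provider tables can be kept as a small lookup so an agent can check the cost category before choosing a prefix. This is an illustrative sketch: the dictionary is transcribed from the tables above, while `cost_of` is a hypothetical helper name and not something the proxy provides.

```python
# Prefixes and cost notes transcribed from the two tables above.
PROVIDERS = {
    "geminicli": "FREE (Google AI Premium/Pro subscription)",
    "claudecli": "FREE (Claude Pro/Max subscription)",
    "codex": "FREE (OpenAI Plus subscription)",
    "vibe": "FREE (Mistral Le Chat subscription)",
    "ollama": "FREE",
    "auto": "FREE (Requires plugin)",
    "switchai": "Per-token",
    "groq": "Per-token",
}


def cost_of(model: str) -> str:
    """Look up the cost note for a 'provider:model' string (hypothetical helper)."""
    provider = model.split(":", 1)[0]
    return PROVIDERS.get(provider, "unknown provider")


print(cost_of("groq:llama-3.3-70b"))  # Per-token
print(cost_of("ollama:llama3"))       # FREE
```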
---
## 🚀 Core Features
### CLI Attachments & Flags
Pass local context and control autonomy via CLI extensions.
```json
{
  "model": "geminicli:",
  "messages": [{"role": "user", "content": "Fix this code"}],
  "extra_body": {
    "cli": {
      "attachments": [{"type": "folder", "path": "./src"}],
      "flags": {"auto_approve": true, "yolo": true}
    }
  }
}
```
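The same request body can be assembled in Python before it is sent. The sketch below only mirrors the field names from the JSON example; `build_cli_request` is a hypothetical helper, and which other attachment types or flags the server accepts is not stated here, so treat anything beyond the example as an assumption.

```python
import json


def build_cli_request(prompt, folder, model="geminicli:", auto_approve=True):
    """Build a request body shaped like the JSON example above (hypothetical helper)."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        # Field names copied from the example; other values are untested assumptions.
        "extra_body": {
            "cli": {
                "attachments": [{"type": "folder", "path": folder}],
                "flags": {"auto_approve": auto_approve},
            }
        },
    }


body = build_cli_request("Fix this code", "./src")
print(json.dumps(body, indent=2))
```

If you use the OpenAI Python SDK instead of raw JSON, note that its `extra_body` argument merges its contents into the top level of the request, so check which shape your server expects.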
### Streaming
Add `"stream": true` to any request for SSE token streaming.
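On the client side, a streamed response has to be reassembled from individual events. The sketch below assumes the proxy emits OpenAI-style SSE lines (`data: {...}` chunks ending with `data: [DONE]`), which this file does not confirm; `iter_stream_text` is a hypothetical helper, and the sample lines are simulated, not captured output.

```python
import json


def iter_stream_text(lines):
    """Yield text deltas from OpenAI-style SSE lines (assumed wire format)."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and SSE comments
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


# Simulated stream; a real one would come from the HTTP response body.
sample = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "data: [DONE]",
]
print("".join(iter_stream_text(sample)))  # Hello
```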
---
## 🌲 Decision Tree
```
What do you need?
├─ FREE + Powerful + Files
│  └─ CLI Providers (geminicli:, claudecli:)
├─ FREE + Private + Fast
│  └─ Local Ollama (ollama:llama3.2)
├─ Ultra-Fast Production
│  └─ Groq Cloud (groq:llama-3.3-70b)
└─ I don't know, you pick
   └─ Intelligent Routing (auto)
```
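The tree above can be expressed as a small lookup with `auto` as the fallback. This is a sketch only: the keys `files`, `private`, and `fast` are made-up labels for the three branches, and `pick_model` is a hypothetical helper; the model strings are the ones shown in the tree.

```python
# Model strings transcribed from the decision tree; the keys are invented labels.
DECISIONS = {
    "files": "geminicli:",         # FREE + Powerful + Files
    "private": "ollama:llama3.2",  # FREE + Private + Fast
    "fast": "groq:llama-3.3-70b",  # Ultra-Fast Production
}


def pick_model(need: str) -> str:
    """Map a need label to a model string; fall back to 'auto' (hypothetical helper)."""
    return DECISIONS.get(need, "auto")


print(pick_model("private"))  # ollama:llama3.2
print(pick_model("unsure"))   # auto
```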
---
## 🛠️ Troubleshooting & Best Practices
| Problem | Fix |
| ---------------- | ---------------------------------------- |
| Connection error | Check if server is running on port 18080 |
| Model not found | Ensure you used the `provider:` prefix |
| 401 Unauthorized | Check API key in `config.yaml` |
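For the connection-error row, a quick way to confirm whether anything is listening on the port is a plain TCP connect. The sketch below uses only the standard library; `server_is_listening` is a hypothetical helper, and a successful connect shows only that some process holds the port, not that it is switchAILocal.

```python
import socket


def server_is_listening(host="localhost", port=18080, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


if not server_is_listening():
    print("Nothing is listening on port 18080; start the server first.")
```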
### Best Practices
1. **Prefer CLI Providers**: They are free and support file attachments.
2. **Check Status**: Use `GET /v1/providers` to see what is active.
3. **Use `auto`**: For simple tasks, let the router pick the best model.
4. **Local for Privacy**: Use `ollama:` for confidential data.
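The status check in item 2 can be scripted with the standard library. This is a sketch under assumptions: the response shape of `GET /v1/providers` is not documented in this file, so the code just returns whatever JSON comes back, and sending the key as a Bearer token is assumed from the OpenAI-style client shown earlier. `fetch_providers` is a hypothetical helper.

```python
import json
import urllib.request


def fetch_providers(base_url="http://localhost:18080/v1", api_key="sk-test-123"):
    """GET {base_url}/providers; return parsed JSON, or None if unreachable."""
    req = urllib.request.Request(
        f"{base_url}/providers",
        # Bearer auth is an assumption based on the OpenAI-style client.
        headers={"Authorization": f"Bearer {api_key}"},
    )
    try:
        with urllib.request.urlopen(req, timeout=5) as resp:
            return json.load(resp)
    except (OSError, ValueError):
        # OSError covers refused connections and HTTP errors;
        # ValueError covers a body that is not valid JSON.
        return None


providers = fetch_providers()
print(providers if providers is not None else "Proxy not reachable")
```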
---
*Route wisely. Save tokens. Use CLI.* 🚀