openrouter-cost-controls
Implement cost controls for OpenRouter API usage. Use when setting budgets, preventing overspend, or managing per-key limits. Triggers: 'openrouter budget', 'openrouter cost limit', 'openrouter spending', 'control openrouter cost'.
Best use case
openrouter-cost-controls is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Teams using openrouter-cost-controls should expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in `.claude/skills/openrouter-cost-controls/SKILL.md` inside your project
- Restart your AI agent; it will auto-discover the skill
How openrouter-cost-controls Compares
| Feature / Agent | openrouter-cost-controls | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Implement cost controls for OpenRouter API usage. Use when setting budgets, preventing overspend, or managing per-key limits. Triggers: 'openrouter budget', 'openrouter cost limit', 'openrouter spending', 'control openrouter cost'.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
Related Guides
Best AI Skills for Claude
Explore the best AI skills for Claude and Claude Code across coding, research, workflow automation, documentation, and agent operations.
ChatGPT vs Claude for Agent Skills
Compare ChatGPT and Claude for AI agent skills across coding, writing, research, and reusable workflow execution.
SKILL.md Source
# OpenRouter Cost Controls
## Overview
OpenRouter provides per-key credit limits, a credit balance API, and per-generation cost queries. Combined with client-side budget middleware, you can enforce hard spending caps at the key level and soft caps in your application. This skill covers key-level limits, per-request cost tracking, budget enforcement middleware, and alert systems.
## Check Credit Balance
```bash
# Current balance and limits (a null limit means the key has no cap,
# so "remaining" is reported as null instead of a negative number)
curl -s https://openrouter.ai/api/v1/auth/key \
  -H "Authorization: Bearer $OPENROUTER_API_KEY" | jq '{
    credits_used: .data.usage,
    credit_limit: .data.limit,
    remaining: (if .data.limit == null then null else (.data.limit - .data.usage) end),
    is_free_tier: .data.is_free_tier,
    rate_limit: .data.rate_limit
  }'
```
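The same check can be done in application code. The sketch below is a minimal helper that works on an already-fetched response; it assumes the response shape implied by the `jq` filter above (`data.usage`, `data.limit`) and treats a `null` limit as "no cap". The function name `remaining_credits` is hypothetical.

```python
from typing import Optional


def remaining_credits(key_info: dict) -> Optional[float]:
    """Return remaining credits in USD, or None when the key has no limit.

    `key_info` is assumed to be the parsed JSON body of the key endpoint.
    """
    data = key_info.get("data", {})
    limit = data.get("limit")
    if limit is None:
        # Assumption: a null limit means the key is uncapped
        return None
    return float(limit) - float(data.get("usage", 0.0))


# Illustrative payload, not real account data
sample = {"data": {"usage": 1.5, "limit": 10.0, "is_free_tier": False}}
print(remaining_credits(sample))
```

Returning `None` for uncapped keys keeps callers from mistaking "no limit" for "negative balance".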
## Per-Key Credit Limits
```python
import os
import requests

MGMT_KEY = os.environ["OPENROUTER_MGMT_KEY"]  # Management key
HEADERS = {"Authorization": f"Bearer {MGMT_KEY}"}

# Create a key with a $50 credit limit
resp = requests.post(
    "https://openrouter.ai/api/v1/keys",
    headers=HEADERS,
    json={"name": "backend-prod", "limit": 50.0},
    timeout=10,
)
resp.raise_for_status()  # Fail loudly instead of raising a KeyError below
new_key = resp.json()["data"]["key"]  # sk-or-v1-...

# List all keys with their limits and usage
resp = requests.get("https://openrouter.ai/api/v1/keys", headers=HEADERS, timeout=10)
resp.raise_for_status()
for k in resp.json().get("data", []):
    limit = k.get("limit")  # None means the key has no credit limit
    limit_text = "unlimited" if limit is None else f"${limit:.2f}"
    print(f"{k['name']}: ${k.get('usage', 0):.4f} / {limit_text}")
```
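Once keys are listed, it can help to flag the ones approaching their limit before they start returning 402s. The following is a rough sketch that operates on the `data` list from the listing call above; the helper name `keys_near_limit` and the 80% default threshold are assumptions chosen for illustration.

```python
def keys_near_limit(keys: list, threshold: float = 0.8) -> list:
    """Return (name, fraction_used) for keys at or above `threshold` of their limit.

    Keys without a limit (None or 0) are skipped because there is nothing
    to compare their usage against.
    """
    flagged = []
    for k in keys:
        limit = k.get("limit")
        if not limit:
            continue
        fraction = k.get("usage", 0.0) / limit
        if fraction >= threshold:
            flagged.append((k.get("name", "?"), round(fraction, 3)))
    return flagged


# Illustrative data shaped like the key listing above
example_keys = [
    {"name": "backend-prod", "usage": 45.0, "limit": 50.0},
    {"name": "batch-jobs", "usage": 1.0, "limit": None},
    {"name": "staging", "usage": 10.0, "limit": 50.0},
]
print(keys_near_limit(example_keys))
```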
## Budget Enforcement Middleware
```python
import os
import time

import requests
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=os.environ["OPENROUTER_API_KEY"],
    default_headers={"HTTP-Referer": "https://my-app.com", "X-Title": "my-app"},
)


class BudgetEnforcer:
    """Client-side budget enforcement with server-side cost verification."""

    def __init__(self, daily_limit: float = 10.0, per_request_limit: float = 0.50):
        self.daily_limit = daily_limit
        self.per_request_limit = per_request_limit
        self._daily_spend = 0.0
        self._day = time.strftime("%Y-%m-%d")  # Local date of the current window

    def _reset_if_new_day(self):
        today = time.strftime("%Y-%m-%d")
        if today != self._day:
            self._daily_spend = 0.0
            self._day = today

    def estimate_cost(self, model: str, prompt_tokens: int, max_tokens: int) -> float:
        """Pre-flight cost estimate using cached pricing."""
        # Representative rates (fetch from /models in production)
        RATES = {
            "anthropic/claude-3.5-sonnet": (3.0, 15.0),  # per 1M tokens
            "openai/gpt-4o": (2.50, 10.0),
            "openai/gpt-4o-mini": (0.15, 0.60),
            "meta-llama/llama-3.1-8b-instruct": (0.06, 0.06),
        }
        # Unknown models fall back to the most expensive rate listed above
        prompt_rate, comp_rate = RATES.get(model, (3.0, 15.0))
        return (prompt_tokens * prompt_rate / 1_000_000) + (max_tokens * comp_rate / 1_000_000)

    def check_budget(self, model: str, prompt_tokens: int, max_tokens: int):
        """Raise if request would exceed budget."""
        self._reset_if_new_day()
        estimated = self.estimate_cost(model, prompt_tokens, max_tokens)
        if estimated > self.per_request_limit:
            raise ValueError(
                f"Request estimated at ${estimated:.4f} exceeds per-request limit ${self.per_request_limit}"
            )
        if self._daily_spend + estimated > self.daily_limit:
            raise ValueError(
                f"Daily spend ${self._daily_spend:.4f} + request ${estimated:.4f} "
                f"exceeds daily limit ${self.daily_limit}"
            )

    def record_cost(self, generation_id: str):
        """Record actual cost from generation endpoint."""
        try:
            gen = requests.get(
                f"https://openrouter.ai/api/v1/generation?id={generation_id}",
                headers={"Authorization": f"Bearer {os.environ['OPENROUTER_API_KEY']}"},
                timeout=5,
            ).json()
            cost = float(gen.get("data", {}).get("total_cost") or 0)
            self._daily_spend += cost
            return cost
        except Exception:
            # A failed lookup is counted as $0, so daily spend may be under-reported
            return 0.0


budget = BudgetEnforcer(daily_limit=25.0, per_request_limit=1.0)
```
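The enforcer above is only useful if every request goes through it. Below is a minimal sketch of a wrapper that checks the budget, sends the request, then records the actual cost. The names `guarded_completion` and `rough_token_count` are hypothetical, the 4-characters-per-token estimate is a crude assumption (a real tokenizer would be more accurate), and it assumes the completion's `id` can be passed to the generation endpoint as the generation ID.

```python
def rough_token_count(messages: list) -> int:
    """Crude prompt-size estimate: about 4 characters per token (assumption)."""
    return sum(len(m.get("content", "")) for m in messages) // 4 + 1


def guarded_completion(budget, client, model: str, messages: list, max_tokens: int):
    """Check the budget, send the request, then record the actual cost.

    `budget` is expected to expose check_budget() and record_cost() like the
    BudgetEnforcer above; `client` is an OpenAI-compatible client.
    """
    # Raises ValueError before any money is spent if the estimate is too high
    budget.check_budget(model, rough_token_count(messages), max_tokens)
    response = client.chat.completions.create(
        model=model,
        messages=messages,
        max_tokens=max_tokens,
    )
    # Assumption: response.id doubles as the generation ID for cost lookup
    cost = budget.record_cost(response.id)
    return response, cost
```

With the objects defined earlier, a call would look like `guarded_completion(budget, client, "openai/gpt-4o-mini", messages, max_tokens=200)`. Because the pre-flight check uses `max_tokens` as the completion size, it errs on the side of rejecting requests rather than overspending.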
## Cost-Saving Model Variants
```python
# Uses the `client` configured in the middleware example above

# :floor variant -- cheapest provider for a model
response = client.chat.completions.create(
    model="anthropic/claude-3.5-sonnet:floor",  # Cheapest provider
    messages=[{"role": "user", "content": "Summarize this..."}],
    max_tokens=500,
)

# :free variant -- free providers (where available)
response = client.chat.completions.create(
    model="google/gemma-2-9b-it:free",
    messages=[{"role": "user", "content": "Hello"}],
    max_tokens=100,
)

# Route simple tasks to cheap models
ROUTING = {
    "classification": "openai/gpt-4o-mini",            # $0.15/$0.60 per 1M
    "summarization": "anthropic/claude-3-haiku",       # $0.25/$1.25 per 1M
    "code_generation": "anthropic/claude-3.5-sonnet",  # $3/$15 per 1M
    "simple_qa": "meta-llama/llama-3.1-8b-instruct",   # $0.06/$0.06 per 1M
}
```
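A routing table like the one above still needs a lookup with a fallback for unrecognized task types. Here is one possible sketch; the `pick_model` helper and the choice of `openai/gpt-4o-mini` as the default are assumptions, not part of the skill itself.

```python
ROUTING = {
    "classification": "openai/gpt-4o-mini",
    "summarization": "anthropic/claude-3-haiku",
    "code_generation": "anthropic/claude-3.5-sonnet",
    "simple_qa": "meta-llama/llama-3.1-8b-instruct",
}

# Assumption: unknown task types default to a cheap general-purpose model
DEFAULT_MODEL = "openai/gpt-4o-mini"


def pick_model(task_type: str, prefer_cheapest_provider: bool = False) -> str:
    """Return the model slug for a task, optionally with the :floor suffix."""
    model = ROUTING.get(task_type, DEFAULT_MODEL)
    return f"{model}:floor" if prefer_cheapest_provider else model


print(pick_model("summarization"))
print(pick_model("code_generation", prefer_cheapest_provider=True))
```

Defaulting to a cheap model means a typo in the task type costs quality rather than money, which is usually the safer failure mode for a cost-control layer.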
## Budget Alert Script
```bash
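#!/bin/bash
# Alert when credits drop below threshold
THRESHOLD=5.0
REMAINING=$(curl -s https://openrouter.ai/api/v1/auth/key \
  -H "Authorization: Bearer $OPENROUTER_API_KEY" | \
  jq -r 'if .data.limit == null then "unlimited" else (.data.limit - .data.usage) end')
if [ -z "$REMAINING" ] || [ "$REMAINING" = "unlimited" ]; then
  # No per-key limit was reported (or the request failed), so there is nothing to compare
  echo "Skipping check: no per-key credit limit reported (${REMAINING:-no response})"
elif (( $(echo "$REMAINING < $THRESHOLD" | bc -l) )); then
  echo "ALERT: OpenRouter credits low: \$$REMAINING remaining"
  # Send to Slack, PagerDuty, etc.
fi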
```
## Error Handling
| Error | Cause | Fix |
|-------|-------|-----|
| 402 Payment Required | Credits exhausted | Top up at openrouter.ai/credits or use `:free` model |
| 402 Key limit reached | Per-key credit limit hit | Increase key limit or create new key |
| Budget middleware rejects | Client-side limit exceeded | Increase limit or optimize prompt tokens |
| Stale pricing data | Cached rates outdated | Refresh from `/api/v1/models` daily |
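For the stale-pricing case, the cached rate table can be rebuilt from the models listing. The sketch below only covers the parsing step and assumes each entry in the `/api/v1/models` response carries an `id` and a `pricing` object whose `prompt` and `completion` fields are USD-per-token strings; verify that shape against a live response before relying on it. The function name `rates_from_models` is hypothetical.

```python
from decimal import Decimal


def rates_from_models(payload: dict) -> dict:
    """Map model id -> (prompt_rate, completion_rate) in USD per 1M tokens."""
    rates = {}
    for model in payload.get("data", []):
        pricing = model.get("pricing") or {}
        try:
            model_id = model["id"]
            # Decimal avoids float rounding noise when scaling tiny per-token prices
            prompt = Decimal(str(pricing["prompt"])) * 1_000_000
            completion = Decimal(str(pricing["completion"])) * 1_000_000
        except (KeyError, ArithmeticError):
            # Skip entries with missing or non-numeric pricing
            continue
        rates[model_id] = (float(prompt), float(completion))
    return rates


# Illustrative payload in the assumed shape
sample_payload = {
    "data": [
        {"id": "openai/gpt-4o-mini", "pricing": {"prompt": "0.00000015", "completion": "0.0000006"}},
        {"id": "example/no-pricing", "pricing": {}},
    ]
}
print(rates_from_models(sample_payload))
```

The resulting dictionary has the same `(prompt, completion)` per-1M layout as the `RATES` table in the middleware, so it could replace the hard-coded values after a daily refresh.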
## Enterprise Considerations
- Set per-key credit limits via management API to isolate blast radius per service/team
- Query `/api/v1/generation?id=` after each request for exact cost auditing
- Use `:floor` variant to automatically pick the cheapest provider for a model
- Route simple tasks to budget models ($0.06/1M) and reserve premium models for complex tasks
- Set `max_tokens` on every request to cap completion cost
- Enable auto-topup in the dashboard to prevent production service interruptions
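The `max_tokens` advice can be turned into arithmetic: given a per-request budget and the prompt size, the largest safe `max_tokens` is whatever budget remains after the prompt, divided by the completion rate. A minimal sketch, with rates expressed in USD per 1M tokens as elsewhere in this skill (the helper name `max_tokens_for_budget` is hypothetical):

```python
def max_tokens_for_budget(
    budget_usd: float,
    prompt_tokens: int,
    prompt_rate: float,
    completion_rate: float,
) -> int:
    """Largest max_tokens whose worst-case cost fits within budget_usd.

    Rates are USD per 1M tokens. Returns 0 when the prompt alone
    already uses up the budget.
    """
    if completion_rate <= 0:
        raise ValueError("completion_rate must be positive")
    prompt_cost = prompt_tokens * prompt_rate / 1_000_000
    left = budget_usd - prompt_cost
    if left <= 0:
        return 0
    return int(left * 1_000_000 / completion_rate)


# A 1,000-token prompt at $3/1M costs $0.003, leaving $0.007 of a $0.01 budget
# for completion tokens at $15/1M
print(max_tokens_for_budget(0.01, 1000, 3.0, 15.0))
```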
## References
- [Examples](${CLAUDE_SKILL_DIR}/references/examples.md) | [Errors](${CLAUDE_SKILL_DIR}/references/errors.md)
- [Credits](https://openrouter.ai/credits) | [Key Provisioning](https://openrouter.ai/docs/guides/overview/auth/provisioning-api-keys)
Related Skills
workhuman-cost-tuning
Workhuman cost tuning for employee recognition and rewards API. Use when integrating Workhuman Social Recognition, or building recognition workflows with HRIS systems. Trigger: "workhuman cost tuning".
wispr-cost-tuning
Wispr Flow cost tuning for voice-to-text API integration. Use when integrating Wispr Flow dictation, WebSocket streaming, or building voice-powered applications. Trigger: "wispr cost tuning".
windsurf-cost-tuning
Optimize Windsurf licensing costs through seat management, tier selection, and credit monitoring. Use when analyzing Windsurf billing, reducing per-seat costs, or implementing usage monitoring and budget controls. Trigger with phrases like "windsurf cost", "windsurf billing", "reduce windsurf costs", "windsurf pricing", "windsurf budget".
webflow-cost-tuning
Optimize Webflow costs through plan selection, CDN read optimization, bulk endpoint usage, and API usage monitoring with budget alerts. Use when analyzing Webflow billing, reducing API costs, or implementing usage monitoring for Webflow integrations. Trigger with phrases like "webflow cost", "webflow billing", "reduce webflow costs", "webflow pricing", "webflow budget".
vercel-cost-tuning
Optimize Vercel costs through plan selection, function efficiency, and usage monitoring. Use when analyzing Vercel billing, reducing function execution costs, or implementing spend management and budget alerts. Trigger with phrases like "vercel cost", "vercel billing", "reduce vercel costs", "vercel pricing", "vercel expensive", "vercel budget".
veeva-cost-tuning
Veeva Vault cost tuning for REST API and clinical operations. Use when working with Veeva Vault document management and CRM. Trigger: "veeva cost tuning".
vastai-cost-tuning
Optimize Vast.ai GPU cloud costs through smart instance selection and lifecycle management. Use when analyzing GPU spending, reducing training costs, or implementing budget controls for Vast.ai workloads. Trigger with phrases like "vastai cost", "vastai billing", "reduce vastai costs", "vastai pricing", "vastai budget".
twinmind-cost-tuning
Optimize TwinMind costs across Free, Pro ($10/mo), and Enterprise tiers with usage monitoring and tier selection guidance. Use when implementing cost tuning, or managing TwinMind meeting AI operations. Trigger with phrases like "twinmind cost tuning".
together-cost-tuning
Together AI cost tuning for inference, fine-tuning, and model deployment. Use when working with Together AI's OpenAI-compatible API. Trigger: "together cost tuning".
techsmith-cost-tuning
TechSmith cost tuning for Snagit COM API and Camtasia automation. Use when working with TechSmith screen capture and video editing automation. Trigger: "techsmith cost tuning".
supabase-cost-tuning
Optimize Supabase costs through plan selection, database tuning, storage cleanup, connection pooling, and Edge Function optimization. Use when analyzing Supabase billing, reducing costs, right-sizing compute, or implementing usage tracking and budget alerts. Trigger with phrases like "supabase cost", "supabase billing", "reduce supabase costs", "supabase pricing", "supabase expensive", "supabase budget".
stackblitz-cost-tuning
StackBlitz pricing tiers: free embedding, WebContainer API commercial licensing. Use when working with WebContainers or StackBlitz SDK. Trigger: "stackblitz cost".