openrouter-cost-controls

Implement cost controls for OpenRouter API usage. Use when setting budgets, preventing overspend, or managing per-key limits. Triggers: 'openrouter budget', 'openrouter cost limit', 'openrouter spending', 'control openrouter cost'.

1,868 stars

byjeremylongshore

View on GitHub Installation ↓

Best use case

openrouter-cost-controls is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using openrouter-cost-controls should expect a more consistent output, faster repeated execution, less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$curl -o ~/.claude/skills/openrouter-cost-controls/SKILL.md --create-dirs "https://raw.githubusercontent.com/jeremylongshore/claude-code-plugins-plus-skills/main/plugins/saas-packs/openrouter-pack/skills/openrouter-cost-controls/SKILL.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/openrouter-cost-controls/SKILL.md inside your project
Restart your AI agent — it will auto-discover the skill

How openrouter-cost-controls Compares

Feature / Agent	openrouter-cost-controls	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

What does this skill do?

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

Related Guides

Best AI Skills for Claude

Explore the best AI skills for Claude and Claude Code across coding, research, workflow automation, documentation, and agent operations.

ChatGPT vs Claude for Agent Skills

Compare ChatGPT and Claude for AI agent skills across coding, writing, research, and reusable workflow execution.

SKILL.md Source

# OpenRouter Cost Controls

## Overview

OpenRouter provides per-key credit limits, a credit balance API, and per-generation cost queries. Combined with client-side budget middleware, you can enforce hard spending caps at the key level and soft caps in your application. This skill covers key-level limits, per-request cost tracking, budget enforcement middleware, and alert systems.

## Check Credit Balance

```bash
# Current balance and limits
curl -s https://openrouter.ai/api/v1/auth/key \
  -H "Authorization: Bearer $OPENROUTER_API_KEY" | jq '{
    credits_used: .data.usage,
    credit_limit: .data.limit,
    remaining: ((.data.limit // 0) - .data.usage),
    is_free_tier: .data.is_free_tier,
    rate_limit: .data.rate_limit
  }'
```

## Per-Key Credit Limits

```python
import os, requests

MGMT_KEY = os.environ["OPENROUTER_MGMT_KEY"]  # Management key

# Create a key with a $50 credit limit
resp = requests.post(
    "https://openrouter.ai/api/v1/keys",
    headers={"Authorization": f"Bearer {MGMT_KEY}"},
    json={"name": "backend-prod", "limit": 50.0},
)
new_key = resp.json()["data"]["key"]  # sk-or-v1-...

# List all keys with their limits and usage
keys = requests.get(
    "https://openrouter.ai/api/v1/keys",
    headers={"Authorization": f"Bearer {MGMT_KEY}"},
).json()
for k in keys.get("data", []):
    print(f"{k['name']}: ${k.get('usage', 0):.4f} / ${k.get('limit', 'unlimited')}")
```

## Budget Enforcement Middleware

```python
import os, time, requests
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=os.environ["OPENROUTER_API_KEY"],
    default_headers={"HTTP-Referer": "https://my-app.com", "X-Title": "my-app"},
)

class BudgetEnforcer:
    """Client-side budget enforcement with server-side cost verification."""

    def __init__(self, daily_limit: float = 10.0, per_request_limit: float = 0.50):
        self.daily_limit = daily_limit
        self.per_request_limit = per_request_limit
        self._daily_spend = 0.0
        self._day = time.strftime("%Y-%m-%d")

    def _reset_if_new_day(self):
        today = time.strftime("%Y-%m-%d")
        if today != self._day:
            self._daily_spend = 0.0
            self._day = today

    def estimate_cost(self, model: str, prompt_tokens: int, max_tokens: int) -> float:
        """Pre-flight cost estimate using cached pricing."""
        # Representative rates (fetch from /models in production)
        RATES = {
            "anthropic/claude-3.5-sonnet": (3.0, 15.0),    # per 1M tokens
            "openai/gpt-4o": (2.50, 10.0),
            "openai/gpt-4o-mini": (0.15, 0.60),
            "meta-llama/llama-3.1-8b-instruct": (0.06, 0.06),
        }
        prompt_rate, comp_rate = RATES.get(model, (3.0, 15.0))
        return (prompt_tokens * prompt_rate / 1_000_000) + (max_tokens * comp_rate / 1_000_000)

    def check_budget(self, model: str, prompt_tokens: int, max_tokens: int):
        """Raise if request would exceed budget."""
        self._reset_if_new_day()
        estimated = self.estimate_cost(model, prompt_tokens, max_tokens)

        if estimated > self.per_request_limit:
            raise ValueError(
                f"Request estimated at ${estimated:.4f} exceeds per-request limit ${self.per_request_limit}"
            )
        if self._daily_spend + estimated > self.daily_limit:
            raise ValueError(
                f"Daily spend ${self._daily_spend:.4f} + request ${estimated:.4f} "
                f"exceeds daily limit ${self.daily_limit}"
            )

    def record_cost(self, generation_id: str):
        """Record actual cost from generation endpoint."""
        try:
            gen = requests.get(
                f"https://openrouter.ai/api/v1/generation?id={generation_id}",
                headers={"Authorization": f"Bearer {os.environ['OPENROUTER_API_KEY']}"},
                timeout=5,
            ).json()
            cost = float(gen.get("data", {}).get("total_cost", 0))
            self._daily_spend += cost
            return cost
        except Exception:
            return 0.0

budget = BudgetEnforcer(daily_limit=25.0, per_request_limit=1.0)
```

## Cost-Saving Model Variants

```python
# :floor variant -- cheapest provider for a model
response = client.chat.completions.create(
    model="anthropic/claude-3.5-sonnet:floor",  # Cheapest provider
    messages=[{"role": "user", "content": "Summarize this..."}],
    max_tokens=500,
)

# :free variant -- free providers (where available)
response = client.chat.completions.create(
    model="google/gemma-2-9b-it:free",
    messages=[{"role": "user", "content": "Hello"}],
    max_tokens=100,
)

# Route simple tasks to cheap models
ROUTING = {
    "classification": "openai/gpt-4o-mini",      # $0.15/$0.60 per 1M
    "summarization": "anthropic/claude-3-haiku",  # $0.25/$1.25 per 1M
    "code_generation": "anthropic/claude-3.5-sonnet",  # $3/$15 per 1M
    "simple_qa": "meta-llama/llama-3.1-8b-instruct",  # $0.06/$0.06 per 1M
}
```

## Budget Alert Script

```bash
#!/bin/bash
# Alert when credits drop below threshold
THRESHOLD=5.0

REMAINING=$(curl -s https://openrouter.ai/api/v1/auth/key \
  -H "Authorization: Bearer $OPENROUTER_API_KEY" | \
  jq '((.data.limit // 0) - .data.usage)')

if (( $(echo "$REMAINING < $THRESHOLD" | bc -l) )); then
  echo "ALERT: OpenRouter credits low: \$$REMAINING remaining"
  # Send to Slack, PagerDuty, etc.
fi
```

## Error Handling

| Error | Cause | Fix |
|-------|-------|-----|
| 402 Payment Required | Credits exhausted | Top up at openrouter.ai/credits or use `:free` model |
| 402 Key limit reached | Per-key credit limit hit | Increase key limit or create new key |
| Budget middleware rejects | Client-side limit exceeded | Increase limit or optimize prompt tokens |
| Stale pricing data | Cached rates outdated | Refresh from `/api/v1/models` daily |

## Enterprise Considerations

- Set per-key credit limits via management API to isolate blast radius per service/team
- Query `/api/v1/generation?id=` after each request for exact cost auditing
- Use `:floor` variant to automatically pick the cheapest provider for a model
- Route simple tasks to budget models ($0.06/1M) and reserve premium models for complex tasks
- Set `max_tokens` on every request to cap completion cost
- Enable auto-topup in the dashboard to prevent production service interruptions

## References

- [Examples](${CLAUDE_SKILL_DIR}/references/examples.md) | [Errors](${CLAUDE_SKILL_DIR}/references/errors.md)
- [Credits](https://openrouter.ai/credits) | [Key Provisioning](https://openrouter.ai/docs/guides/overview/auth/provisioning-api-keys)

Related Skills

workhuman-cost-tuning

1868

from jeremylongshore/claude-code-plugins-plus-skills

Workhuman cost tuning for employee recognition and rewards API. Use when integrating Workhuman Social Recognition, or building recognition workflows with HRIS systems. Trigger: "workhuman cost tuning".

wispr-cost-tuning

1868

from jeremylongshore/claude-code-plugins-plus-skills

Wispr Flow cost tuning for voice-to-text API integration. Use when integrating Wispr Flow dictation, WebSocket streaming, or building voice-powered applications. Trigger: "wispr cost tuning".

windsurf-cost-tuning

1868

from jeremylongshore/claude-code-plugins-plus-skills

Optimize Windsurf licensing costs through seat management, tier selection, and credit monitoring. Use when analyzing Windsurf billing, reducing per-seat costs, or implementing usage monitoring and budget controls. Trigger with phrases like "windsurf cost", "windsurf billing", "reduce windsurf costs", "windsurf pricing", "windsurf budget".

webflow-cost-tuning

1868

from jeremylongshore/claude-code-plugins-plus-skills

Optimize Webflow costs through plan selection, CDN read optimization, bulk endpoint usage, and API usage monitoring with budget alerts. Use when analyzing Webflow billing, reducing API costs, or implementing usage monitoring for Webflow integrations. Trigger with phrases like "webflow cost", "webflow billing", "reduce webflow costs", "webflow pricing", "webflow budget".

vercel-cost-tuning

1868

from jeremylongshore/claude-code-plugins-plus-skills

Optimize Vercel costs through plan selection, function efficiency, and usage monitoring. Use when analyzing Vercel billing, reducing function execution costs, or implementing spend management and budget alerts. Trigger with phrases like "vercel cost", "vercel billing", "reduce vercel costs", "vercel pricing", "vercel expensive", "vercel budget".

veeva-cost-tuning

1868

from jeremylongshore/claude-code-plugins-plus-skills

Veeva Vault cost tuning for REST API and clinical operations. Use when working with Veeva Vault document management and CRM. Trigger: "veeva cost tuning".

vastai-cost-tuning

1868

from jeremylongshore/claude-code-plugins-plus-skills

Optimize Vast.ai GPU cloud costs through smart instance selection and lifecycle management. Use when analyzing GPU spending, reducing training costs, or implementing budget controls for Vast.ai workloads. Trigger with phrases like "vastai cost", "vastai billing", "reduce vastai costs", "vastai pricing", "vastai budget".

twinmind-cost-tuning

1868

from jeremylongshore/claude-code-plugins-plus-skills

Optimize TwinMind costs across Free, Pro ($10/mo), and Enterprise tiers with usage monitoring and tier selection guidance. Use when implementing cost tuning, or managing TwinMind meeting AI operations. Trigger with phrases like "twinmind cost tuning", "twinmind cost tuning".

together-cost-tuning

1868

from jeremylongshore/claude-code-plugins-plus-skills

Together AI cost tuning for inference, fine-tuning, and model deployment. Use when working with Together AI's OpenAI-compatible API. Trigger: "together cost tuning".

techsmith-cost-tuning

1868

from jeremylongshore/claude-code-plugins-plus-skills

TechSmith cost tuning for Snagit COM API and Camtasia automation. Use when working with TechSmith screen capture and video editing automation. Trigger: "techsmith cost tuning".

supabase-cost-tuning

1868

from jeremylongshore/claude-code-plugins-plus-skills

Optimize Supabase costs through plan selection, database tuning, storage cleanup, connection pooling, and Edge Function optimization. Use when analyzing Supabase billing, reducing costs, right-sizing compute, or implementing usage tracking and budget alerts. Trigger with phrases like "supabase cost", "supabase billing", "reduce supabase costs", "supabase pricing", "supabase expensive", "supabase budget".

stackblitz-cost-tuning

1868

from jeremylongshore/claude-code-plugins-plus-skills

StackBlitz pricing tiers: free embedding, WebContainer API commercial licensing. Use when working with WebContainers or StackBlitz SDK. Trigger: "stackblitz cost".