throttling-apis

Implement API throttling policies to protect backend services from overload. Use when controlling API request rates. Trigger with phrases like "throttle API", "control request rate", or "add throttling".

1,868 stars

Best use case

throttling-apis is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using throttling-apis should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/throttling-apis/SKILL.md --create-dirs "https://raw.githubusercontent.com/jeremylongshore/claude-code-plugins-plus-skills/main/plugins/api-development/api-throttling-manager/skills/throttling-apis/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/throttling-apis/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How throttling-apis Compares

| Feature / Agent | throttling-apis | Standard Approach |
|-----------------|-----------------|-------------------|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |

Frequently Asked Questions

What does this skill do?

Implement API throttling policies to protect backend services from overload. Use when controlling API request rates. Trigger with phrases like "throttle API", "control request rate", or "add throttling".

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# Throttling APIs

## Overview

Implement API throttling policies that protect backend services from overload by controlling request concurrency, queue depth, and processing rates. Apply backpressure mechanisms including concurrent request limits, priority queues, circuit breakers, and adaptive throttling that adjusts limits based on real-time backend health metrics.

## Prerequisites

- Middleware-capable web framework (Express, FastAPI, Spring Boot, Gin)
- Redis for distributed throttle state tracking, or an in-memory store for single-instance deployments
- Monitoring system exposing backend latency and error rate metrics (Prometheus, CloudWatch)
- Load testing tool (k6, Artillery, wrk) for validating throttle behavior under pressure
- Queue system for request buffering during throttle events (optional: Bull, SQS)

## Instructions

1. Analyze existing route handlers and middleware using Grep and Read to identify endpoints with high latency, database-heavy operations, or external service dependencies that need throttle protection.
2. Implement a concurrency limiter middleware that tracks in-flight requests per endpoint and rejects new requests with 503 Service Unavailable when the concurrent limit is reached.
3. Add priority queue support that classifies requests by API key tier (free, pro, enterprise) and serves higher-tier requests first when approaching throttle limits.
4. Build a circuit breaker for downstream service calls that opens after configurable failure thresholds (e.g., 5 failures in 10 seconds), returning 503 with `Retry-After` during the open state.
5. Configure adaptive throttling that monitors backend response latency percentiles (p95, p99) and automatically reduces concurrency limits when latency exceeds SLO thresholds.
6. Add throttle state headers to all responses: `X-Throttle-Limit`, `X-Throttle-Remaining`, and `X-Throttle-Reset` for client-side awareness.
7. Implement graceful degradation strategies per endpoint: serve cached responses, return partial results, or queue requests for deferred processing.
8. Write load tests that verify throttle engagement at expected thresholds, proper 503 responses with `Retry-After`, and recovery behavior when load subsides.
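
Steps 2 and 6 above can be sketched as a single Express-style middleware. This is a minimal illustration, not the skill's actual `throttle.js`: the factory name `createConcurrencyLimiter`, its options, and the choice to key limits by method and path are all assumptions. `X-Throttle-Reset` is left out because a pure concurrency limit has no fixed reset time.

```javascript
// Hypothetical factory; names and options are assumptions for illustration.
// Assumes an Express-style (req, res, next) signature where req.path exists.
function createConcurrencyLimiter({ limit, retryAfterSeconds = 1 }) {
  const inFlight = new Map(); // endpoint key -> number of in-flight requests

  return function throttle(req, res, next) {
    const key = `${req.method} ${req.path}`;
    const current = inFlight.get(key) || 0;

    res.setHeader('X-Throttle-Limit', String(limit));

    if (current >= limit) {
      // Limit reached: reject with 503 and tell the client when to retry.
      res.setHeader('X-Throttle-Remaining', '0');
      res.setHeader('Retry-After', String(retryAfterSeconds));
      res.statusCode = 503;
      res.end('Service Unavailable');
      return;
    }

    inFlight.set(key, current + 1);
    res.setHeader('X-Throttle-Remaining', String(limit - current - 1));

    // Release the slot exactly once, whether the response finishes
    // normally or the connection closes early.
    let released = false;
    const release = () => {
      if (released) return;
      released = true;
      inFlight.set(key, inFlight.get(key) - 1);
    };
    res.on('finish', release);
    res.on('close', release);

    next();
  };
}
```

In an Express app this would typically be mounted per route, for example `app.get('/reports', createConcurrencyLimiter({ limit: 10 }), handler)`. The counters live in process memory, so each instance enforces its own limit; sharing state across instances would need an external store such as Redis.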

See `${CLAUDE_SKILL_DIR}/references/implementation.md` for the full implementation guide.
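
As a rough illustration of step 5, the sketch below recomputes a concurrency limit from a window of latency samples. It assumes an additive-increase, multiplicative-decrease rule (halve on an SLO breach, add one otherwise), which is one common choice rather than anything the skill prescribes. The names `percentile` and `nextLimit` and the option names are hypothetical.

```javascript
// Nearest-rank percentile over a window of latency samples (milliseconds).
function percentile(samples, p) {
  if (samples.length === 0) return 0;
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(rank, 1) - 1];
}

// Hypothetical AIMD controller: halve the limit when p95 latency exceeds
// the SLO, otherwise raise it by one. The result is clamped to
// [minLimit, maxLimit]. An empty window counts as healthy.
function nextLimit(currentLimit, samples, { sloP95Ms, minLimit, maxLimit }) {
  const p95 = percentile(samples, 95);
  const proposed = p95 > sloP95Ms
    ? Math.floor(currentLimit / 2)
    : currentLimit + 1;
  return Math.min(maxLimit, Math.max(minLimit, proposed));
}
```

A caller would run `nextLimit` on a timer (say, every few seconds) with the samples collected since the last run and feed the result back into the concurrency limiter. The sharp decrease and slow increase are meant to shed load quickly and recover cautiously.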

## Output

- `${CLAUDE_SKILL_DIR}/src/middleware/throttle.js` - Concurrency and request rate throttling middleware
- `${CLAUDE_SKILL_DIR}/src/middleware/circuit-breaker.js` - Circuit breaker for downstream service protection
- `${CLAUDE_SKILL_DIR}/src/middleware/priority-queue.js` - Tier-based request prioritization
- `${CLAUDE_SKILL_DIR}/src/config/throttle-config.js` - Per-endpoint throttle policy definitions
- `${CLAUDE_SKILL_DIR}/tests/throttle/` - Load tests validating throttle engagement and recovery

## Error Handling

| Error | Cause | Solution |
|-------|-------|----------|
| 503 Service Unavailable | Concurrency limit reached for the endpoint | Return `Retry-After` header with estimated wait time; include throttle state headers |
| 503 Circuit Open | Circuit breaker tripped due to downstream failures | Return cached response if available; provide circuit reset time in response body |
| Queue overflow | Request buffer exceeded maximum depth | Reject with 503; alert operations team; consider scaling backend capacity |
| Stale throttle state | Redis connection lost; throttle counters become inaccurate | Fall back to in-process counters; reconnect with backoff; log state inconsistency |
| Priority starvation | Low-tier requests never served under sustained high-tier load | Reserve minimum throughput percentage for each tier to prevent complete starvation |
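
The "503 Circuit Open" row depends on knowing when the circuit will reset. A minimal sketch of a breaker that exposes that value follows. The class shape, the 30-second cool-down, and the injectable `now` clock are assumptions for illustration; the defaults of 5 failures in 10 seconds mirror the example thresholds in the instructions.

```javascript
// Hypothetical circuit breaker; option names and the cool-down are assumptions.
class CircuitBreaker {
  constructor({ failureThreshold = 5, windowMs = 10000, openMs = 30000, now = Date.now } = {}) {
    this.failureThreshold = failureThreshold;
    this.windowMs = windowMs;
    this.openMs = openMs; // assumed cool-down before a trial call is allowed
    this.now = now;
    this.failures = [];   // timestamps of recent failures
    this.openedAt = null; // null while the breaker has never tripped or was reset
  }

  // Seconds until calls are allowed again; 0 means calls may proceed.
  retryAfterSeconds() {
    if (this.openedAt === null) return 0;
    const remainingMs = this.openedAt + this.openMs - this.now();
    return remainingMs > 0 ? Math.ceil(remainingMs / 1000) : 0;
  }

  recordSuccess() {
    this.failures = [];
    this.openedAt = null;
  }

  recordFailure() {
    const t = this.now();
    // A failure while the breaker is already tripped (for example the
    // trial call after the cool-down) re-opens it immediately.
    const alreadyTripped = this.openedAt !== null;
    this.failures = this.failures.filter((ts) => t - ts < this.windowMs);
    this.failures.push(t);
    if (alreadyTripped || this.failures.length >= this.failureThreshold) {
      this.openedAt = t;
    }
  }
}
```

A handler would check `retryAfterSeconds()` before calling the downstream service. When the value is non-zero, it can serve a cached response or return 503 with that value as `Retry-After`. Libraries such as cockatiel or Resilience4j cover the same ground with more complete half-open handling.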

Refer to `${CLAUDE_SKILL_DIR}/references/errors.md` for comprehensive error patterns.

## Examples

**Database-heavy endpoint protection**: Apply a concurrency limit of 10 to a report generation endpoint that runs expensive aggregation queries, queueing additional requests with estimated wait times.
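
One rough way to produce the estimated wait time is to assume the endpoint is saturated and that slots free up at an even rate, so a slot opens about every `avgDurationSeconds / limit` seconds. The sketch below uses that steady-state assumption. The function names, the `maxQueueDepth` option, and the 30-second average are hypothetical.

```javascript
// Rough steady-state estimate. queuePosition is 1-based (1 = next in line).
function estimateWaitSeconds(queuePosition, { limit, avgDurationSeconds }) {
  return Math.ceil((queuePosition * avgDurationSeconds) / limit);
}

// Decide what to do with an incoming request given the current load.
function admit(state, policy) {
  if (state.inFlight < policy.limit) {
    return { action: 'run' };
  }
  if (state.queued >= policy.maxQueueDepth) {
    return { action: 'reject', status: 503 }; // queue overflow
  }
  const position = state.queued + 1;
  return {
    action: 'queue',
    position,
    estimatedWaitSeconds: estimateWaitSeconds(position, policy),
  };
}

// Example policy: the limit of 10 comes from the text; the rest is assumed.
const reportPolicy = { limit: 10, maxQueueDepth: 50, avgDurationSeconds: 30 };
```

With this policy, the first queued request would be told to expect about 3 seconds and the tenth about 30. Real durations vary, so the estimate is best presented to clients as a hint rather than a guarantee.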

**Multi-tier SaaS throttling**: Enterprise tier gets 100 concurrent requests, Pro tier gets 25, Free tier gets 5, with priority queue ensuring enterprise requests are served first during contention.
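
The tier limits above could be expressed as a small config plus a queue that serves higher tiers first. This is a sketch under assumptions: the object keys, the `tier` field on each request, and the linear-scan queue are illustrative, and a binary heap would be a better fit for deep queues.

```javascript
// Limits taken from the example; the key names are an assumption.
const TIER_LIMITS = { enterprise: 100, pro: 25, free: 5 };
const TIER_PRIORITY = { enterprise: 0, pro: 1, free: 2 }; // lower is served first

// True when the tier is below its concurrent request limit.
function hasCapacity(tier, inFlightByTier) {
  return (inFlightByTier[tier] || 0) < TIER_LIMITS[tier];
}

class PriorityQueue {
  constructor() {
    this.items = [];
    this.seq = 0; // arrival counter used to break ties
  }

  // Assumes request.tier is one of the keys in TIER_PRIORITY.
  enqueue(request) {
    this.items.push({ request, priority: TIER_PRIORITY[request.tier], seq: this.seq++ });
  }

  // Returns the highest-priority request; ties are served in arrival order.
  dequeue() {
    if (this.items.length === 0) return undefined;
    let best = 0;
    for (let i = 1; i < this.items.length; i++) {
      const a = this.items[i];
      const b = this.items[best];
      if (a.priority < b.priority || (a.priority === b.priority && a.seq < b.seq)) {
        best = i;
      }
    }
    return this.items.splice(best, 1)[0].request;
  }
}
```

Strict priority like this can starve the free tier under sustained enterprise load, which is the "Priority starvation" case in the error table. Reserving a minimum share per tier would need an additional scheduling rule on top of this queue.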

**Adaptive autoscaling trigger**: Throttle middleware emits metrics that trigger horizontal pod autoscaling when the throttle engagement rate stays above 20% for 5 minutes.
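
To make the engagement-rate metric concrete, the sketch below tracks the fraction of requests throttled over a sliding 5-minute window and compares it with the 20% threshold. It approximates "sustained" as the window average. The class and function names are hypothetical, and in practice the middleware would more likely export counters to a metrics system and leave the threshold check to the autoscaler.

```javascript
// Hypothetical sliding-window tracker for throttle engagement.
class EngagementWindow {
  constructor({ windowMs = 5 * 60 * 1000, now = Date.now } = {}) {
    this.windowMs = windowMs;
    this.now = now;
    this.events = []; // one { ts, throttled } entry per request
  }

  // Call once per request; throttled is true when the request was rejected.
  record(throttled) {
    this.events.push({ ts: this.now(), throttled });
  }

  // Fraction of requests in the window that were throttled (0 when idle).
  rate() {
    const cutoff = this.now() - this.windowMs;
    this.events = this.events.filter((e) => e.ts > cutoff);
    if (this.events.length === 0) return 0;
    const throttled = this.events.filter((e) => e.throttled).length;
    return throttled / this.events.length;
  }
}

// True when the windowed engagement rate is strictly above the threshold.
function shouldScaleOut(engagement, threshold = 0.2) {
  return engagement.rate() > threshold;
}
```

Storing one entry per request keeps the example short but grows with traffic; bucketed counters (for example one pair of counts per 10 seconds) would bound memory for a busy endpoint.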

See `${CLAUDE_SKILL_DIR}/references/examples.md` for additional examples.

## Resources

- Circuit Breaker pattern: Martin Fowler's "CircuitBreaker" article
- Resilience4j (Java) and cockatiel (Node.js) circuit breaker libraries
- Netflix Concurrency Limits library for adaptive throttling
- Token bucket and leaky bucket algorithm implementations

Related Skills

fuzzing-apis

1868
from jeremylongshore/claude-code-plugins-plus-skills

Perform API fuzzing to discover edge cases, crashes, and security vulnerabilities. Use when performing specialized testing. Trigger with phrases like "fuzz the API", "run fuzzing tests", or "discover edge cases".

generating-rest-apis

1868
from jeremylongshore/claude-code-plugins-plus-skills

Generate complete REST API implementations from OpenAPI specifications or database schemas. Use when generating RESTful API implementations. Trigger with phrases like "generate REST API", "create RESTful API", or "build REST endpoints".

versioning-apis

1868
from jeremylongshore/claude-code-plugins-plus-skills

Implement API versioning with backward compatibility, deprecation notices, and migration paths. Use when managing API versions and backward compatibility. Trigger with phrases like "version the API", "manage API versions", or "handle API versioning".

rate-limiting-apis

1868
from jeremylongshore/claude-code-plugins-plus-skills

Implement sophisticated rate limiting with sliding windows, token buckets, and quotas. Use when protecting APIs from excessive requests. Trigger with phrases like "add rate limiting", "limit API requests", or "implement rate limits".

monitoring-apis

1868
from jeremylongshore/claude-code-plugins-plus-skills

Build real-time API monitoring dashboards with metrics, alerts, and health checks. Use when tracking API health and performance metrics. Trigger with phrases like "monitor the API", "add API metrics", or "setup API monitoring".

mocking-apis

1868
from jeremylongshore/claude-code-plugins-plus-skills

Generate mock API servers for testing and development with realistic response data. Use when creating mock APIs for development and testing. Trigger with phrases like "create mock API", "generate API mock", or "setup mock server".

migrating-apis

1868
from jeremylongshore/claude-code-plugins-plus-skills

Implement API migrations between versions, platforms, or frameworks with minimal downtime. Use when upgrading APIs between versions. Trigger with phrases like "migrate the API", "upgrade API version", or "migrate to new API".

load-testing-apis

1868
from jeremylongshore/claude-code-plugins-plus-skills

Execute comprehensive load and stress testing to validate API performance and scalability. Use when validating API performance under load. Trigger with phrases like "load test the API", "stress test API", or "benchmark API performance".

api-throttling-setup

1868
from jeremylongshore/claude-code-plugins-plus-skills

API Throttling Setup - Auto-activating skill for API Development. Triggers on: api throttling setup. Part of the API Development skill category.

schema-optimization-orchestrator

1868
from jeremylongshore/claude-code-plugins-plus-skills

Multi-phase schema optimization workflow orchestrator. Creates session directories, spawns phase agents sequentially, validates outputs, aggregates results. Trigger: "run schema optimization", "optimize schema workflow", "execute schema phases"

test-skill

1868
from jeremylongshore/claude-code-plugins-plus-skills

Test skill for E2E validation. Trigger with "run test skill" or "execute test". Use this skill when testing skill activation and tool permissions.

example-skill

1868
from jeremylongshore/claude-code-plugins-plus-skills

Brief description of what this skill does and when the model should activate it. Use when [describe the user's intent or situation]. Trigger with "example phrase", "another trigger", "/example-skill".