throttling-apis

Implement API throttling policies to protect backend services from overload. Use when controlling API request rates. Trigger with phrases like "throttle API", "control request rate", or "add throttling".

25 stars

Best use case

throttling-apis is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using throttling-apis should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/throttling-apis/SKILL.md --create-dirs "https://raw.githubusercontent.com/ComeOnOliver/skillshub/main/skills/jeremylongshore/claude-code-plugins-plus-skills/throttling-apis/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/throttling-apis/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How throttling-apis Compares

| Feature / Agent | throttling-apis | Standard Approach |
|-----------------|-----------------|-------------------|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |

Frequently Asked Questions

What does this skill do?

Implement API throttling policies to protect backend services from overload. Use when controlling API request rates. Trigger with phrases like "throttle API", "control request rate", or "add throttling".

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# Throttling APIs

## Overview

Implement API throttling policies that protect backend services from overload by controlling request concurrency, queue depth, and processing rates. Apply backpressure mechanisms including concurrent request limits, priority queues, circuit breakers, and adaptive throttling that adjusts limits based on real-time backend health metrics.
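
To make the adaptive part of this overview concrete, here is a minimal sketch of one possible adjustment rule: additive increase, multiplicative decrease driven by p95 latency. The function names, the `sloMs`, `minLimit`, `maxLimit`, and `decrease` parameters, and the choice of AIMD itself are assumptions for illustration, not something the skill prescribes.

```javascript
// Nearest-rank percentile of a non-empty array of latencies in milliseconds.
function percentile(samples, p) {
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(0, rank - 1)];
}

// Hypothetical AIMD rule: cut the limit when p95 breaches the SLO,
// otherwise probe upward by one slot per evaluation.
function nextLimit(currentLimit, latenciesMs, options) {
  const { sloMs, minLimit = 1, maxLimit = 100, decrease = 0.5 } = options;
  if (latenciesMs.length === 0) return currentLimit; // no data, no change
  const p95 = percentile(latenciesMs, 95);
  if (p95 > sloMs) {
    return Math.max(minLimit, Math.floor(currentLimit * decrease));
  }
  return Math.min(maxLimit, currentLimit + 1);
}
```

A caller would typically evaluate `nextLimit` on a timer with the latencies collected since the last evaluation, then feed the result into whatever enforces the concurrency limit.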

## Prerequisites

- Middleware-capable web framework (Express, FastAPI, Spring Boot, Gin)
- Redis or in-memory store for distributed throttle state tracking
- Monitoring system exposing backend latency and error rate metrics (Prometheus, CloudWatch)
- Load testing tool (k6, Artillery, wrk) for validating throttle behavior under pressure
- Queue system for request buffering during throttle events (optional: Bull, SQS)

## Instructions

1. Analyze existing route handlers and middleware using Grep and Read to identify endpoints with high latency, database-heavy operations, or external service dependencies that need throttle protection.
2. Implement a concurrency limiter middleware that tracks in-flight requests per endpoint and rejects new requests with 503 Service Unavailable when the concurrency limit is reached.
3. Add priority queue support that classifies requests by API key tier (free, pro, enterprise) and serves higher-tier requests first when approaching throttle limits.
4. Build a circuit breaker for downstream service calls that opens after configurable failure thresholds (e.g., 5 failures in 10 seconds), returning 503 with `Retry-After` during the open state.
5. Configure adaptive throttling that monitors backend response latency percentiles (p95, p99) and automatically reduces concurrency limits when latency exceeds SLO thresholds.
6. Add throttle state headers to all responses: `X-Throttle-Limit`, `X-Throttle-Remaining`, and `X-Throttle-Reset` for client-side awareness.
7. Implement graceful degradation strategies per endpoint: serve cached responses, return partial results, or queue requests for deferred processing.
8. Write load tests that verify throttle engagement at expected thresholds, proper 503 responses with `Retry-After`, and recovery behavior when load subsides.
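
Steps 2 and 6 can be sketched together as an Express-style middleware. This is a minimal sketch, not the skill's actual `throttle.js`: `createConcurrencyLimiter`, `limit`, and `retryAfterSeconds` are hypothetical names, the counter lives in process memory only (no Redis), and `X-Throttle-Reset` is left out because this sketch has no fixed window to report.

```javascript
// Hypothetical factory returning a (req, res, next) middleware.
// Create one instance per route to get a per-endpoint limit.
function createConcurrencyLimiter({ limit, retryAfterSeconds = 1 }) {
  let inFlight = 0;

  return function throttle(req, res, next) {
    res.setHeader("X-Throttle-Limit", String(limit));

    if (inFlight >= limit) {
      // At capacity: reject immediately instead of piling onto the backend.
      res.setHeader("X-Throttle-Remaining", "0");
      res.setHeader("Retry-After", String(retryAfterSeconds));
      res.statusCode = 503;
      res.end("Service Unavailable");
      return;
    }

    inFlight += 1;
    res.setHeader("X-Throttle-Remaining", String(limit - inFlight));

    // Free the slot exactly once, whether the response finishes
    // normally or the connection closes early.
    let released = false;
    const release = () => {
      if (!released) {
        released = true;
        inFlight -= 1;
      }
    };
    res.once("finish", release);
    res.once("close", release);

    next();
  };
}
```

With Express, this could be mounted per route, for example `app.get("/reports", createConcurrencyLimiter({ limit: 10 }), handler)`, where `handler` stands for your own route function.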

See `${CLAUDE_SKILL_DIR}/references/implementation.md` for the full implementation guide.
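
The circuit breaker from step 4 might look roughly like the following. This is a sketch under stated assumptions: the class and method names are hypothetical, `openMs` (the cool-down before trial requests are allowed) is an assumed parameter the instructions do not specify, and the clock is injectable so the behavior can be tested without waiting.

```javascript
// Hypothetical breaker; defaults mirror the "5 failures in 10 seconds" example.
class CircuitBreaker {
  constructor({ failureThreshold = 5, windowMs = 10000, openMs = 30000, now = Date.now } = {}) {
    this.failureThreshold = failureThreshold;
    this.windowMs = windowMs;
    this.openMs = openMs; // assumed cool-down, not taken from the skill
    this.now = now;
    this.failures = []; // timestamps of recent failures
    this.openedAt = null; // null means the circuit is closed
  }

  // True when a downstream call may be attempted. After the cool-down this
  // sketch admits every request; a stricter breaker would admit one trial.
  allowRequest() {
    if (this.openedAt === null) return true;
    return this.now() - this.openedAt >= this.openMs;
  }

  // Value for the Retry-After header; 0 when the circuit is not open.
  retryAfterSeconds() {
    if (this.openedAt === null) return 0;
    const remainingMs = this.openedAt + this.openMs - this.now();
    return Math.max(0, Math.ceil(remainingMs / 1000));
  }

  recordSuccess() {
    this.failures = [];
    this.openedAt = null;
  }

  recordFailure() {
    const t = this.now();
    if (this.openedAt !== null) {
      this.openedAt = t; // a failure while open restarts the cool-down
      return;
    }
    // Keep only failures inside the rolling window, then add this one.
    this.failures = this.failures.filter((ts) => t - ts < this.windowMs);
    this.failures.push(t);
    if (this.failures.length >= this.failureThreshold) this.openedAt = t;
  }
}
```

A caller would check `allowRequest()` before each downstream call, respond with 503 and `Retry-After: retryAfterSeconds()` when it returns false, and report the outcome through `recordSuccess()` or `recordFailure()` otherwise.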

## Output

- `${CLAUDE_SKILL_DIR}/src/middleware/throttle.js` - Concurrency and request rate throttling middleware
- `${CLAUDE_SKILL_DIR}/src/middleware/circuit-breaker.js` - Circuit breaker for downstream service protection
- `${CLAUDE_SKILL_DIR}/src/middleware/priority-queue.js` - Tier-based request prioritization
- `${CLAUDE_SKILL_DIR}/src/config/throttle-config.js` - Per-endpoint throttle policy definitions
- `${CLAUDE_SKILL_DIR}/tests/throttle/` - Load tests validating throttle engagement and recovery

## Error Handling

| Error | Cause | Solution |
|-------|-------|----------|
| 503 Service Unavailable | Concurrency limit reached for the endpoint | Return `Retry-After` header with estimated wait time; include throttle state headers |
| 503 Circuit Open | Circuit breaker tripped due to downstream failures | Return cached response if available; provide circuit reset time in response body |
| Queue overflow | Request buffer exceeded maximum depth | Reject with 503; alert operations team; consider scaling backend capacity |
| Stale throttle state | Redis connection lost; throttle counters become inaccurate | Fall back to in-process counters; reconnect with backoff; log state inconsistency |
| Priority starvation | Low-tier requests never served under sustained high-tier load | Reserve minimum throughput percentage for each tier to prevent complete starvation |

Refer to `${CLAUDE_SKILL_DIR}/references/errors.md` for comprehensive error patterns.
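
The priority starvation row can be illustrated with a small scheduler that serves tiers in priority order but guarantees each tier a minimum share of dequeues. This is one possible approach, offered as a sketch: `TieredQueue`, `reservedPercent`, and the percentages used in the usage note are hypothetical.

```javascript
// Hypothetical scheduler. `tiers` is ordered highest priority first;
// reservedPercent is the minimum share of all dequeues owed to that tier.
class TieredQueue {
  constructor(tiers) {
    this.tiers = tiers.map((t) => ({
      name: t.name,
      reservedPercent: t.reservedPercent,
      items: [],
      served: 0,
    }));
    this.totalServed = 0;
  }

  enqueue(tierName, item) {
    const tier = this.tiers.find((t) => t.name === tierName);
    if (!tier) throw new Error("unknown tier: " + tierName);
    tier.items.push(item);
  }

  dequeue() {
    const waiting = this.tiers.filter((t) => t.items.length > 0);
    if (waiting.length === 0) return undefined;

    // A tier is "owed" a turn when serving anyone else now would leave it
    // below its reserved share of all dequeues so far.
    const upcoming = this.totalServed + 1;
    const owed = waiting.find(
      (t) => t.served < Math.floor((t.reservedPercent * upcoming) / 100)
    );

    const tier = owed || waiting[0]; // otherwise strict priority order
    tier.served += 1;
    this.totalServed += 1;
    return tier.items.shift();
  }
}
```

For instance, `new TieredQueue([{ name: "enterprise", reservedPercent: 0 }, { name: "free", reservedPercent: 10 }])` gives the free tier every tenth dequeue while both tiers have requests waiting. One caveat of this simple accounting: a tier that sits idle for a long time accumulates a deficit and is served in a burst when it returns.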

## Examples

**Database-heavy endpoint protection**: Apply concurrency limit of 10 to a report generation endpoint that runs expensive aggregation queries, queueing additional requests with estimated wait times.
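
A rough sketch of this example follows, assuming a bounded in-memory queue. `AdmissionQueue`, `maxQueue`, and `avgDurationMs` are hypothetical; the wait estimate is a coarse upper bound that assumes queued work drains in batches of `concurrency`, each taking about `avgDurationMs`.

```javascript
// Hypothetical admission control for one expensive endpoint.
class AdmissionQueue {
  constructor({ concurrency = 10, maxQueue = 50, avgDurationMs = 2000 } = {}) {
    this.concurrency = concurrency;
    this.maxQueue = maxQueue; // assumed cap on buffered requests
    this.avgDurationMs = avgDurationMs; // assumed typical job duration
    this.inFlight = 0;
    this.waiting = [];
  }

  // Returns how the job was handled: running, queued (with estimate), or rejected.
  submit(job) {
    if (this.inFlight < this.concurrency) {
      this.inFlight += 1;
      return { status: "running" };
    }
    if (this.waiting.length >= this.maxQueue) {
      return { status: "rejected" }; // queue overflow: caller should send 503
    }
    this.waiting.push(job);
    const position = this.waiting.length; // 1 means next in line
    const estimatedWaitMs = Math.ceil(position / this.concurrency) * this.avgDurationMs;
    return { status: "queued", position, estimatedWaitMs };
  }

  // Call when a running job finishes. Returns the next job to start, if any;
  // in that case the freed slot passes straight to it.
  complete() {
    if (this.waiting.length === 0) {
      this.inFlight -= 1;
      return undefined;
    }
    return this.waiting.shift();
  }
}
```

The `estimatedWaitMs` value could be surfaced to the client in a `Retry-After` header or in the response body of a 202-style acknowledgement, depending on how deferred processing is exposed.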

**Multi-tier SaaS throttling**: Enterprise tier gets 100 concurrent requests, Pro tier gets 25, Free tier gets 5, with priority queue ensuring enterprise requests are served first during contention.
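
The tier limits in this example could be expressed as a small policy object plus a per-tier counter. The limits (100, 25, 5) come from the example; the names `TIER_LIMITS`, `TIER_PRIORITY`, `TierLimiter`, and `byTierPriority` are hypothetical, and the state is in-process only.

```javascript
// Limits from the example above; key names are assumptions.
const TIER_LIMITS = { enterprise: 100, pro: 25, free: 5 };
const TIER_PRIORITY = ["enterprise", "pro", "free"]; // highest first

class TierLimiter {
  constructor(limits = TIER_LIMITS) {
    this.limits = limits;
    this.inFlight = {};
    for (const tier of Object.keys(limits)) this.inFlight[tier] = 0;
  }

  // Takes a slot for the tier if one is free; returns whether it succeeded.
  tryAcquire(tier) {
    if (!Object.prototype.hasOwnProperty.call(this.limits, tier)) {
      throw new Error("unknown tier: " + tier);
    }
    if (this.inFlight[tier] >= this.limits[tier]) return false;
    this.inFlight[tier] += 1;
    return true;
  }

  release(tier) {
    if (this.inFlight[tier] > 0) this.inFlight[tier] -= 1;
  }
}

// Orders waiting requests so higher tiers come first. Array.prototype.sort
// is stable, so arrival order is preserved within a tier.
function byTierPriority(requests) {
  return [...requests].sort(
    (a, b) => TIER_PRIORITY.indexOf(a.tier) - TIER_PRIORITY.indexOf(b.tier)
  );
}
```

Strict ordering like `byTierPriority` is what makes the starvation safeguard in the Error Handling table necessary under sustained enterprise load.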

**Adaptive autoscaling trigger**: Throttle middleware emits metrics that trigger horizontal pod autoscaling when throttle engagement rate exceeds 20% sustained over 5 minutes.
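
One way to compute the engagement rate in this example is a trailing-window tracker. This sketch approximates "sustained over 5 minutes" as the average over the last five minutes, which is an assumption; `EngagementTracker` and its methods are hypothetical, and how the resulting value is exported to an autoscaler is left out.

```javascript
// Hypothetical tracker: fraction of requests throttled in a trailing window.
class EngagementTracker {
  constructor({ windowMs = 5 * 60 * 1000, now = Date.now } = {}) {
    this.windowMs = windowMs;
    this.now = now;
    this.events = []; // { at, throttled }, oldest first
  }

  // Drop events that have aged out of the window.
  prune() {
    const cutoff = this.now() - this.windowMs;
    while (this.events.length > 0 && this.events[0].at <= cutoff) {
      this.events.shift();
    }
  }

  // Call once per request with true if it was throttled.
  record(throttled) {
    this.events.push({ at: this.now(), throttled });
    this.prune();
  }

  rate() {
    this.prune();
    if (this.events.length === 0) return 0;
    const throttled = this.events.filter((e) => e.throttled).length;
    return throttled / this.events.length;
  }

  // Threshold defaults to the 20% figure from the example.
  shouldScale(threshold = 0.2) {
    return this.rate() > threshold;
  }
}
```

Storing one entry per request keeps the sketch short; at high request volumes, per-second buckets of counters would bound memory better.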

See `${CLAUDE_SKILL_DIR}/references/examples.md` for additional examples.

## Resources

- Circuit Breaker pattern: Martin Fowler's "CircuitBreaker" article
- Resilience4j (Java) and cockatiel (Node.js) circuit breaker libraries
- Netflix Concurrency Limits library for adaptive throttling
- Token bucket and leaky bucket algorithm implementations

Related Skills

versioning-apis

25
from ComeOnOliver/skillshub

Implement API versioning with backward compatibility, deprecation notices, and migration paths. Use when managing API versions and backward compatibility. Trigger with phrases like "version the API", "manage API versions", or "handle API versioning".

rate-limiting-apis

25
from ComeOnOliver/skillshub

Implement sophisticated rate limiting with sliding windows, token buckets, and quotas. Use when protecting APIs from excessive requests. Trigger with phrases like "add rate limiting", "limit API requests", or "implement rate limits".

monitoring-apis

25
from ComeOnOliver/skillshub

Build real-time API monitoring dashboards with metrics, alerts, and health checks. Use when tracking API health and performance metrics. Trigger with phrases like "monitor the API", "add API metrics", or "setup API monitoring".

mocking-apis

25
from ComeOnOliver/skillshub

Generate mock API servers for testing and development with realistic response data. Use when creating mock APIs for development and testing. Trigger with phrases like "create mock API", "generate API mock", or "setup mock server".

migrating-apis

25
from ComeOnOliver/skillshub

Implement API migrations between versions, platforms, or frameworks with minimal downtime. Use when upgrading APIs between versions. Trigger with phrases like "migrate the API", "upgrade API version", or "migrate to new API".

load-testing-apis

25
from ComeOnOliver/skillshub

Execute comprehensive load and stress testing to validate API performance and scalability. Use when validating API performance under load. Trigger with phrases like "load test the API", "stress test API", or "benchmark API performance".

generating-rest-apis

25
from ComeOnOliver/skillshub

Generate complete REST API implementations from OpenAPI specifications or database schemas. Use when generating RESTful API implementations. Trigger with phrases like "generate REST API", "create RESTful API", or "build REST endpoints".

api-throttling-setup

25
from ComeOnOliver/skillshub

API Throttling Setup: auto-activating skill for API Development. Triggers on: api throttling setup. Part of the API Development skill category.

fuzzing-apis

25
from ComeOnOliver/skillshub

This skill enables Claude to perform automated fuzz testing on APIs to discover vulnerabilities, crashes, and unexpected behavior. It leverages malformed inputs, boundary values, and random payloads to generate comprehensive fuzz test suites. Use this skill when you need to identify potential SQL injection, XSS, command injection vulnerabilities, input validation failures, and edge cases in APIs. Trigger this skill by requesting fuzz testing, vulnerability scanning, or security analysis of an API. The skill is invoked using the `/fuzz-api` command.

tRPC — End-to-End Type-Safe APIs

25
from ComeOnOliver/skillshub

You are an expert in tRPC, the framework for building type-safe APIs without schemas or code generation. You help developers create full-stack TypeScript applications where the server defines procedures and the client calls them with full type inference — no REST routes, no GraphQL schemas, no OpenAPI specs, just TypeScript functions that are type-safe from database to UI.

eodhd-apis-automation

25
from ComeOnOliver/skillshub

Automate EODHD APIs tasks via Rube MCP (Composio). Always search tools first for current schemas.

designing-apis

25
from ComeOnOliver/skillshub

Designs REST and GraphQL APIs including endpoints, error handling, versioning, and documentation. Use when creating new APIs, designing endpoints, reviewing API contracts, or when asked about REST, GraphQL, or API patterns.