load-testing-apis

Execute comprehensive load and stress testing to validate API performance and scalability. Use when validating API performance under load. Trigger with phrases like "load test the API", "stress test API", or "benchmark API performance".

by jeremylongshore

Best use case

load-testing-apis is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using load-testing-apis should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/load-testing-apis/SKILL.md --create-dirs "https://raw.githubusercontent.com/jeremylongshore/claude-code-plugins-plus-skills/main/plugins/api-development/api-load-tester/skills/load-testing-apis/SKILL.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/load-testing-apis/SKILL.md inside your project
Restart your AI agent so that it auto-discovers the skill

How load-testing-apis Compares

Feature / Agent	load-testing-apis	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

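What does this skill do?

It generates and executes load, stress, and soak tests (as k6, Artillery, or wrk scripts) to validate API performance, identify bottlenecks, and establish throughput baselines.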

Where can I find the source code?

The source code is in the jeremylongshore/claude-code-plugins-plus-skills repository on GitHub, under plugins/api-development/api-load-tester/skills/load-testing-apis/.

SKILL.md Source

# Load Testing APIs

## Overview

Execute comprehensive load, stress, and soak tests to validate API performance, identify bottlenecks, and establish throughput baselines. Generate test scripts for k6, Artillery, or wrk that simulate realistic traffic patterns with configurable virtual user ramp-up, request distribution, and failure threshold assertions.

## Prerequisites

- Load testing tool installed: k6 (recommended), Artillery, wrk, or Apache JMeter
- Target API deployed in a staging/performance environment (never load test production without safeguards)
- Monitoring stack accessible: Grafana/Prometheus, Datadog, or CloudWatch for correlating test results with server metrics
- API authentication credentials for testing (API keys, test user JWT tokens)
- Baseline performance SLOs defined (target p95 latency, max error rate, minimum throughput)

## Instructions

1. Read the API specification and route definitions using Glob and Read to build a complete list of endpoints, identifying high-traffic paths and resource-intensive operations.
2. Define test scenarios modeling realistic user behavior: browsing (80% reads), checkout (mixed reads + writes), and spike traffic patterns with appropriate think times between requests.
3. Generate k6 or Artillery test scripts with configurable stages: ramp-up (2 min), sustained load (10 min), spike (2 min at 3x the sustained load), and cool-down (2 min).
4. Configure request distribution to match production traffic patterns: use weighted random selection across endpoints rather than a uniform distribution.
5. Add threshold assertions for pass/fail criteria: p95 response time < 500ms, error rate < 1%, throughput > 100 requests/second.
6. Implement data-driven requests using CSV or JSON fixtures for realistic payloads, unique user IDs, and varied query parameters to avoid cache-only testing.
7. Execute a baseline test at expected production load, then gradually increase to 2x, 5x, and 10x that load to identify the breaking point and saturation behavior.
8. Analyze results: correlate latency spikes with server metrics (CPU, memory, DB connections, event loop lag), identify the bottleneck (database, network, compute), and document findings.
9. Generate a performance report comparing results against SLO thresholds with recommendations for optimization.
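
The stage layout in step 3 and the escalation in step 7 can be sketched as plain data. This is a minimal, tool-agnostic sketch in Python, not the skill's own generator: the function names and the base of 50 virtual users are hypothetical. The dictionaries follow the `duration`/`target` shape that k6 accepts in `options.stages`; note that k6 ramps linearly toward each target, so the spike stage climbs to 3x over its 2 minutes rather than holding there.

```python
import json


def build_stages(base_vus, spike_multiplier=3):
    """Return k6-style stages: ramp-up, sustained load, spike, cool-down."""
    return [
        {"duration": "2m", "target": base_vus},                     # ramp-up
        {"duration": "10m", "target": base_vus},                    # sustained load
        {"duration": "2m", "target": base_vus * spike_multiplier},  # climb to the spike level
        {"duration": "2m", "target": 0},                            # cool-down
    ]


def load_steps(base_vus, multipliers=(1, 2, 5, 10)):
    """Virtual-user counts for the baseline run and each escalation run."""
    return [base_vus * m for m in multipliers]


if __name__ == "__main__":
    # base_vus=50 is an illustrative value, not a recommendation.
    for vus in load_steps(50):
        print(vus, json.dumps(build_stages(vus)))
```

Running one profile per entry in `load_steps` keeps each escalation comparable, since only the base load changes between runs.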

See `${CLAUDE_SKILL_DIR}/references/implementation.md` for the full implementation guide.
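
The pass/fail criteria from step 5 can also be checked offline against recorded samples. The sketch below assumes a hypothetical input shape (a list of latencies in milliseconds, an error count, and the test duration) and uses a nearest-rank p95; load testing tools may interpolate percentiles differently, so treat small differences as expected. In k6 the same limits would usually be written as threshold expressions such as `p(95)<500` on `http_req_duration`.

```python
def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list; pct is an integer 1-100."""
    ordered = sorted(values)
    rank = max(1, (pct * len(ordered) + 99) // 100)  # integer ceiling
    return ordered[rank - 1]


def evaluate(latencies_ms, errors, duration_s):
    """Compare one run against the example thresholds from the instructions."""
    total = len(latencies_ms)  # must be non-zero
    results = {
        "p95_ms": percentile(latencies_ms, 95),
        "error_rate": errors / total,
        "throughput_rps": total / duration_s,
    }
    checks = {
        "p95 < 500 ms": results["p95_ms"] < 500,
        "error rate < 1%": results["error_rate"] < 0.01,
        "throughput > 100 rps": results["throughput_rps"] > 100,
    }
    return results, checks
```

A run passes only if every entry in `checks` is true, for example `all(evaluate(samples, errors, duration)[1].values())`.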

## Output

- `${CLAUDE_SKILL_DIR}/load-tests/scenarios/` - k6/Artillery test scripts per traffic scenario
- `${CLAUDE_SKILL_DIR}/load-tests/data/` - Test data fixtures (users, payloads, tokens)
- `${CLAUDE_SKILL_DIR}/load-tests/thresholds.json` - Pass/fail threshold configuration
- `${CLAUDE_SKILL_DIR}/reports/load-test-results.json` - Raw test results with timing data
- `${CLAUDE_SKILL_DIR}/reports/load-test-summary.md` - Human-readable performance analysis report
- `${CLAUDE_SKILL_DIR}/reports/bottleneck-analysis.md` - Identified bottlenecks with remediation recommendations
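
A fixture file for `load-tests/data/` could be produced by a small generator like the one below. This is only a sketch: the column names (`user_id`, `email`, `page`) and the `example.com` addresses are assumptions for illustration, not a schema defined by the skill. The point is that every row differs, so requests do not all hit the same cached response.

```python
import csv
import io

FIELDS = ["user_id", "email", "page"]  # hypothetical fixture columns


def make_user_rows(count):
    """Build `count` distinct test users with varied query parameters."""
    return [
        {
            "user_id": f"user-{i:06d}",
            "email": f"loadtest+{i}@example.com",
            "page": i % 20 + 1,  # spread requests over 20 result pages
        }
        for i in range(count)
    ]


def write_csv(rows, stream):
    """Write the rows as CSV with a header line to any text stream."""
    writer = csv.DictWriter(stream, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)


if __name__ == "__main__":
    buffer = io.StringIO()
    write_csv(make_user_rows(3), buffer)
    print(buffer.getvalue())
```

To write a real file, pass a handle opened with `open(path, "w", newline="")` instead of the in-memory buffer.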

## Error Handling

| Error | Cause | Solution |
|-------|-------|----------|
| Connection refused | Target server ran out of file descriptors or connection pool exhausted | Increase server `ulimit` and connection pool size; note the concurrent connection limit |
| Timeout spike at ramp-up | Server cannot handle connection establishment rate | Implement connection pre-warming; increase ramp-up duration; add connection pooling |
| 429 responses dominate results | Rate limiter engaging during load test | Whitelist load test source IPs in rate limiter; or test rate limiter behavior separately |
| Inconsistent baseline results | Shared staging environment with other traffic | Isolate test environment; run tests during off-hours; use dedicated performance environment |
| Memory leak detected | Soak test shows steadily increasing memory over hours | Flag for development team; identify leaking endpoint by isolating test scenarios |

Refer to `${CLAUDE_SKILL_DIR}/references/errors.md` for comprehensive error patterns.
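
Before interpreting latency numbers, it helps to check whether the rate limiter skewed the run, as in the 429 row above. The sketch below assumes you have the response status codes as a list; the 5% cutoff is an arbitrary illustrative value, not a figure from this skill.

```python
from collections import Counter


def rate_limited_share(status_codes):
    """Fraction of responses that were HTTP 429 (0.0 for an empty run)."""
    if not status_codes:
        return 0.0
    return Counter(status_codes)[429] / len(status_codes)


def diagnose(status_codes, limit=0.05):
    """Label a run 'rate-limited' when 429s exceed the assumed cutoff."""
    return "rate-limited" if rate_limited_share(status_codes) > limit else "ok"
```

A run labeled `rate-limited` mostly measures the limiter rather than the API, so its latency and throughput figures should not be compared against the SLOs.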

## Examples

**E-commerce checkout flow**: Simulate 500 concurrent users browsing products (GET, 70%), adding to cart (POST, 20%), and completing checkout (POST, 10%) with 2-5 second think times between actions.
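
The 70/20/10 mix and the think times in this scenario amount to weighted random selection with a random pause. Below is a minimal sketch of one simulated user's plan; the action names and the `plan_session` helper are hypothetical, and a real script would issue the HTTP requests instead of just listing them.

```python
import random

# (action, HTTP method, weight in percent), taken from the scenario above.
ACTIONS = [
    ("browse", "GET", 70),
    ("add_to_cart", "POST", 20),
    ("checkout", "POST", 10),
]


def plan_session(steps, rng):
    """Pick `steps` weighted actions, each followed by a 2-5 second think time."""
    weights = [weight for _, _, weight in ACTIONS]
    plan = []
    for _ in range(steps):
        name, method, _ = rng.choices(ACTIONS, weights=weights, k=1)[0]
        think_s = rng.uniform(2.0, 5.0)
        plan.append((name, method, think_s))
    return plan


if __name__ == "__main__":
    for name, method, think_s in plan_session(5, random.Random()):
        print(f"{method:4} {name:12} then wait {think_s:.1f}s")
```

Passing in the random generator makes a run repeatable when it is seeded, which is useful when comparing two test runs.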

**API spike test**: Ramp from 50 to 1000 virtual users in 30 seconds to simulate traffic spike from marketing campaign launch, verifying the auto-scaler responds and latency recovers within 60 seconds.
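
The "latency recovers within 60 seconds" check can be made concrete once recovery is defined. The sketch below assumes a hypothetical series of `(timestamp_s, p95_ms)` samples sorted by time, and treats the system as recovered from the first sample after which p95 stays below the threshold for the rest of the series. Other definitions are reasonable; this is one choice.

```python
def recovery_seconds(samples, spike_end_s, threshold_ms):
    """Seconds from the end of the spike until p95 stays under the threshold.

    Returns None if there are no samples after the spike or if the last
    sample is still at or above the threshold.
    """
    after = [(t, p95) for t, p95 in samples if t >= spike_end_s]
    recovered_at = None
    # Walk backwards to find the start of the final healthy stretch.
    for t, p95 in reversed(after):
        if p95 >= threshold_ms:
            break
        recovered_at = t
    if recovered_at is None:
        return None
    return recovered_at - spike_end_s
```

For this scenario the assertion would be that the result is not `None` and is at most 60.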

**Soak test for memory leaks**: Sustain 200 concurrent users for 4 hours, monitoring server memory, connection counts, and response times for degradation patterns indicating resource leaks.
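
One simple way to spot "steadily increasing memory" in soak-test data is a least-squares slope over the memory samples. The sketch below assumes hypothetical `(elapsed_hours, memory_mb)` pairs exported from your monitoring stack; the 10 MB/hour limit is an illustrative value that would need tuning per service.

```python
def slope_per_hour(samples):
    """Least-squares slope of (elapsed_hours, memory_mb) samples.

    Needs at least two samples taken at different times.
    """
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    denominator = sum((x - mean_x) ** 2 for x, _ in samples)
    return numerator / denominator


def looks_like_leak(samples, max_growth_mb_per_hour=10.0):
    """Flag the run when memory grows faster than the assumed limit."""
    return slope_per_hour(samples) > max_growth_mb_per_hour
```

A positive result is a prompt for investigation rather than proof of a leak, since caches that are still warming up can produce the same trend early in a run.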

See `${CLAUDE_SKILL_DIR}/references/examples.md` for additional examples.

## Resources

- k6 documentation: https://k6.io/docs/
- Artillery documentation: https://www.artillery.io/docs
- Google SRE: Load Testing chapter
- Performance testing anti-patterns and best practices
