perf-benchmarker
Use when running performance benchmarks, establishing baselines, or validating regressions with sequential runs. Enforces 60s minimum runs (30s only for binary search) and no parallel benchmarks.
Best use case
perf-benchmarker is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Use when running performance benchmarks, establishing baselines, or validating regressions with sequential runs. Enforces 60s minimum runs (30s only for binary search) and no parallel benchmarks.
Teams using perf-benchmarker should expect a more consistent output, faster repeated execution, less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in
.claude/skills/perf-benchmarker/SKILL.mdinside your project - Restart your AI agent — it will auto-discover the skill
How perf-benchmarker Compares
| Feature / Agent | perf-benchmarker | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Use when running performance benchmarks, establishing baselines, or validating regressions with sequential runs. Enforces 60s minimum runs (30s only for binary search) and no parallel benchmarks.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# perf-benchmarker
Run sequential benchmarks with strict duration rules.
Follow `docs/perf-requirements.md` as the canonical contract.
## Parse Arguments
```javascript
const args = '$ARGUMENTS'.split(' ').filter(Boolean);
const command = args.find(a => !a.match(/^\d+$/)) || '';
const duration = parseInt(args.find(a => a.match(/^\d+$/)) || '60', 10);
```
## Required Rules
- Benchmarks MUST run sequentially (never parallel).
- Minimum duration: 60s per run (30s only for binary search).
- Warmup: 10s minimum before measurement.
- Re-run anomalies.
## Output Format
```
command: <benchmark command>
duration: <seconds>
warmup: <seconds>
results: <metrics summary>
notes: <anomalies or reruns>
```
## Output Contract
Benchmarks MUST emit a JSON metrics block between markers:
```
PERF_METRICS_START
{"scenarios":{"low":{"latency_ms":120},"high":{"latency_ms":450}}}
PERF_METRICS_END
```
## Constraints
- No short runs unless binary-search phase.
- Do not change code while benchmarking.Related Skills
perf-investigation-logger
Use when appending structured perf investigation notes and evidence.
perf-baseline-manager
Use when managing perf baselines, consolidating results, or comparing versions. Ensures one baseline JSON per version.
perf-profiler
Use when profiling CPU/memory hot paths, generating flame graphs, or capturing JFR/perf evidence.
perf-code-paths
Use when mapping code paths, entrypoints, and likely hot files before profiling.
perf-theory-tester
Use when running controlled perf experiments to validate hypotheses.
perf-theory-gatherer
Use when generating performance hypotheses backed by git history and code evidence.
perf-analyzer
Use when synthesizing perf findings into evidence-backed recommendations and decisions.
web-browse
Browse and interact with web pages headlessly. Use when agent needs to navigate websites, click elements, fill forms, read content, or take screenshots.
repo-intel
Use when user asks to "run repo intel", "generate repo map", "analyze repo", "query hotspots", "check ownership", or "bus factor". Unified static analysis - git history, AST symbols, project metadata.
enhance-prompts
Use when improving general prompts for structure, examples, and constraints.
validate-delivery
Use when user asks to "validate delivery", "check readiness", or "verify completion". Runs tests, build, and requirement checks with pass/fail instructions.
drift-analysis
Use when the user asks about plan drift, reality check, comparing docs to code, project state analysis, roadmap alignment, implementation gaps, or needs guidance on identifying discrepancies between documented plans and actual implementation state.