perf-benchmarker

Use when running performance benchmarks, establishing baselines, or validating regressions with sequential runs. Enforces 60s minimum runs (30s only for binary search) and no parallel benchmarks.

677 stars

by agent-sh

Best use case

perf-benchmarker is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using perf-benchmarker should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/perf-benchmarker/SKILL.md --create-dirs "https://raw.githubusercontent.com/agent-sh/agentsys/main/.kiro/skills/perf-benchmarker/SKILL.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/perf-benchmarker/SKILL.md inside your project
Restart your AI agent — it will auto-discover the skill

How perf-benchmarker Compares

Feature / Agent	perf-benchmarker	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

What does this skill do?

It runs performance benchmarks sequentially to establish baselines or validate regressions. It enforces a 60s minimum per run (30s only during binary search) and does not allow parallel benchmarks.

Where can I find the source code?

The source code is in the agent-sh/agentsys repository on GitHub. The skill file lives at .kiro/skills/perf-benchmarker/SKILL.md on the main branch.

SKILL.md Source

# perf-benchmarker

Run sequential benchmarks with strict duration rules.

Follow `docs/perf-requirements.md` as the canonical contract.

## Parse Arguments

```javascript
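// '$ARGUMENTS' is a placeholder for the invocation arguments, e.g. "bench 90".
const args = '$ARGUMENTS'.trim().split(/\s+/).filter(Boolean);
const isSeconds = (a) => /^\d+$/.test(a);
// The first non-numeric token is the command; the first all-digit token is the duration in seconds (default 60).
const command = args.find((a) => !isSeconds(a)) || '';
const duration = parseInt(args.find(isSeconds) || '60', 10);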
```

## Required Rules

- Benchmarks MUST run sequentially (never parallel).
- Minimum duration: 60s per run (30s only for binary search).
- Warmup: 10s minimum before measurement.
- Re-run any benchmark whose results look anomalous, and record the rerun in the notes.
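
The duration, warmup, and sequencing rules above can be checked before any run starts. The following is a minimal sketch, not part of the skill itself: `validateRun`, `runSequentially`, and the `execute` callback are hypothetical names, and only the numeric thresholds come from the rules.

```javascript
// Thresholds taken from the Required Rules; all helper names are hypothetical.
const MIN_DURATION_S = 60;       // normal runs
const MIN_BINARY_SEARCH_S = 30;  // binary-search phase only
const MIN_WARMUP_S = 10;

// Throws if a planned run would violate the duration or warmup rules.
function validateRun({ duration, warmup, binarySearch = false }) {
  const minDuration = binarySearch ? MIN_BINARY_SEARCH_S : MIN_DURATION_S;
  if (!Number.isInteger(duration) || duration < minDuration) {
    throw new RangeError(`duration must be >= ${minDuration}s, got ${duration}`);
  }
  if (!Number.isInteger(warmup) || warmup < MIN_WARMUP_S) {
    throw new RangeError(`warmup must be >= ${MIN_WARMUP_S}s, got ${warmup}`);
  }
  return { duration, warmup, binarySearch };
}

// Runs benchmarks one at a time: each run is awaited before the next starts.
// `execute` is a caller-supplied async function (an assumption of this sketch).
async function runSequentially(runs, execute) {
  const results = [];
  for (const run of runs) {
    results.push(await execute(validateRun(run)));
  }
  return results;
}
```

Awaiting inside a `for...of` loop, rather than using `Promise.all`, is what keeps the runs from overlapping.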

## Output Format

```
command: <benchmark command>
duration: <seconds>
warmup: <seconds>
results: <metrics summary>
notes: <anomalies or reruns>
```
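
As an illustration, the five fields above could be rendered by a small formatter. This is a sketch under assumptions: `formatReport`, its parameter names, and the `npm run bench` command are hypothetical, and the sample values are placeholders rather than real measurements.

```javascript
// Renders the five-field report in the order given by the Output Format.
// `formatReport` is an illustrative helper, not part of the skill.
function formatReport({ command, duration, warmup, results, notes = 'none' }) {
  return [
    `command: ${command}`,
    `duration: ${duration}`,
    `warmup: ${warmup}`,
    `results: ${results}`,
    `notes: ${notes}`,
  ].join('\n');
}

// Placeholder values for demonstration only.
console.log(formatReport({
  command: 'npm run bench',
  duration: 60,
  warmup: 10,
  results: 'low latency_ms=120, high latency_ms=450',
  notes: 'no anomalies',
}));
```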

## Output Contract

Benchmarks MUST emit a JSON metrics block between markers:

```
PERF_METRICS_START
{"scenarios":{"low":{"latency_ms":120},"high":{"latency_ms":450}}}
PERF_METRICS_END
```
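
A consumer of this contract needs to pull the JSON out of the surrounding benchmark output. One possible way to do that is sketched below. The marker strings come from the contract, while `extractMetrics` and the sample log line are assumptions made for illustration.

```javascript
const START = 'PERF_METRICS_START';
const END = 'PERF_METRICS_END';

// Returns the parsed JSON object found between the two markers.
// Throws if either marker is missing or the payload is not valid JSON.
function extractMetrics(output) {
  const start = output.indexOf(START);
  const end = output.indexOf(END, start + START.length);
  if (start === -1 || end === -1) {
    throw new Error('metrics markers not found in benchmark output');
  }
  return JSON.parse(output.slice(start + START.length, end));
}

// Sample output using the metrics block shown in the contract.
const sample = [
  'warming up...',
  'PERF_METRICS_START',
  '{"scenarios":{"low":{"latency_ms":120},"high":{"latency_ms":450}}}',
  'PERF_METRICS_END',
].join('\n');

const metrics = extractMetrics(sample);
console.log(metrics.scenarios.high.latency_ms); // 450
```

Failing loudly when the markers are absent makes a benchmark that skipped the contract easy to spot, rather than silently reporting no metrics.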

## Constraints

- No runs shorter than 60s, except in the binary-search phase (30s minimum).
- Do not change code while benchmarking.

Related Skills

All of the following are from agent-sh/agentsys.

perf-investigation-logger: Use when appending structured perf investigation notes and evidence.
perf-baseline-manager: Use when managing perf baselines, consolidating results, or comparing versions. Ensures one baseline JSON per version.
perf-profiler: Use when profiling CPU/memory hot paths, generating flame graphs, or capturing JFR/perf evidence.
perf-code-paths: Use when mapping code paths, entrypoints, and likely hot files before profiling.
perf-theory-tester: Use when running controlled perf experiments to validate hypotheses.
perf-theory-gatherer: Use when generating performance hypotheses backed by git history and code evidence.
perf-analyzer: Use when synthesizing perf findings into evidence-backed recommendations and decisions.
web-browse: Browse and interact with web pages headlessly. Use when agent needs to navigate websites, click elements, fill forms, read content, or take screenshots.
repo-intel: Use when user asks to "run repo intel", "generate repo map", "analyze repo", "query hotspots", "check ownership", or "bus factor". Unified static analysis - git history, AST symbols, project metadata.
enhance-prompts: Use when improving general prompts for structure, examples, and constraints.
validate-delivery: Use when user asks to "validate delivery", "check readiness", or "verify completion". Runs tests, build, and requirement checks with pass/fail instructions.
drift-analysis: Use when the user asks about plan drift, reality check, comparing docs to code, project state analysis, roadmap alignment, implementation gaps, or needs guidance on identifying discrepancies between documented plans and actual implementation state.