test-failure-investigator
Use when a test is failing and you need to determine root cause: is it flaky, an environment issue, or a real regression? Traces failure from symptom to fix.
Best use case
test-failure-investigator is best used when you need a repeatable AI agent workflow instead of a one-off prompt: a test is failing and you must determine whether the root cause is flakiness, an environment issue, or a real regression, then trace the failure from symptom to fix.
Teams using test-failure-investigator should expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Manual Installation (Claude Code / Cursor / Codex)
- Download SKILL.md from GitHub
- Place it in `.claude/skills/test-failure-investigator/SKILL.md` inside your project
- Restart your AI agent — it will auto-discover the skill
How test-failure-investigator Compares
| Feature / Agent | test-failure-investigator | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
It investigates a failing test to determine whether the root cause is flakiness, an environment issue, or a real regression, then traces the failure from symptom to fix.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# Test Failure Investigator
Runbook-style skill for systematic test failure investigation. Given a failing test, determines root cause and recommends action.
## Activation
```
/test-failure-investigator [test-name-or-file]
```
## Investigation Flow
### Step 1: Classify the Failure
Run the test 3 times and classify:
| Result Pattern | Classification | Action |
|---------------|---------------|--------|
| Fails consistently | **Regression** or **Environment** | Continue to Step 2 |
| Fails intermittently | **Flaky** | Skip to Step 4 |
| Passes now | **Transient** | Check CI logs, environment diff |
```bash
# Run test 3 times
for i in 1 2 3; do echo "--- Run $i ---"; npx jest {{test_file}} 2>&1 | tail -5; done
```
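When the test passes locally but failed in CI, diff the two environments before digging further. A minimal fingerprint sketch, assuming a Node/Jest project; run it on both machines and diff the output (the exact fields captured here are a suggestion, not part of the skill):

```bash
# Capture an environment fingerprint; run locally and in CI, then diff the files
{
  node --version                  # runtime version
  npm --version
  uname -sm                       # OS and architecture
  git rev-parse HEAD              # exact commit under test
  npm ls --depth=0 2>/dev/null    # top-level dependency versions
} > env-fingerprint.txt
```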
### Step 2: Narrow the Scope
```bash
# When did it start failing?
git log --oneline -20 -- {{related_source_files}}
# What changed recently?
git diff HEAD~5 -- {{related_source_files}}
# Does it fail in isolation?
npx jest {{test_file}} --testNamePattern="{{test_name}}"
# Does it fail with other tests?
npx jest --runInBand # sequential execution
```
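If the recent diff does not point at a culprit, `git bisect` can find the first bad commit automatically. A sketch, assuming the failure is deterministic; `{{last_known_good}}` is a placeholder (in the skill's template style) for a commit where the test passed:

```bash
# Binary-search history for the commit that broke the test
git bisect start
git bisect bad                                # current HEAD fails
git bisect good {{last_known_good}}           # known-good commit
# jest's exit code (0 = pass, non-zero = fail) drives the search at each step
git bisect run npx jest {{test_file}} --testNamePattern="{{test_name}}"
git bisect reset                              # return to the original HEAD
```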
### Step 3: Root Cause Analysis
| Symptom | Likely Cause | Investigation |
|---------|-------------|--------------|
| Timeout | Network/DB dependency | Check external service availability |
| Assertion mismatch | Logic change | Compare expected vs actual, check git blame |
| Import error | Dependency change | Check package.json changes, run `npm ci` |
| Permission denied | Environment | Check file permissions, Docker volumes |
| Out of memory | Resource leak | Profile with `--detectOpenHandles` |
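The investigation column maps to a few concrete commands. A sketch of the usual first checks; the line-range and file placeholders are illustrative, in the skill's template style:

```bash
# Assertion mismatch: who last changed the code under test?
git blame -L {{start_line}},{{end_line}} {{source_file}}

# Import error: reinstall exactly from the lockfile, then inspect dependency churn
npm ci
git diff HEAD~5 -- package.json package-lock.json

# Timeout / out of memory: surface leaked handles (sockets, timers, DB pools)
npx jest {{test_file}} --detectOpenHandles --runInBand
```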
### Step 4: Flaky Test Investigation
```bash
# Run 10 times to confirm flakiness
for i in $(seq 1 10); do npx jest {{test_file}} --forceExit 2>&1 | grep -E 'PASS|FAIL'; done
# Common flaky causes:
# - Shared state between tests (missing cleanup)
# - Time-dependent assertions (use fake timers)
# - Race conditions (missing await)
# - Port conflicts (use random ports)
# - Order dependency (run with --randomize)
```
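To chase order dependency specifically, Jest 29.3+ can shuffle test order with a reproducible seed. A sketch: keep shuffling until the failure appears, then rerun with that seed to pin it (`{{failing_seed}}` is a placeholder in the skill's template style):

```bash
# Shuffle test order until it fails; --showSeed prints the seed for each run
for i in $(seq 1 10); do
  npx jest {{test_file}} --randomize --showSeed 2>&1 | grep -E 'PASS|FAIL|Seed'
done

# Reproduce deterministically with the seed from a failing run
npx jest {{test_file}} --randomize --seed {{failing_seed}}
```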
### Step 5: Report
```markdown
## Test Failure Report
- **Test**: {{test_name}}
- **File**: {{test_file}}
- **Classification**: Regression / Flaky / Environment / Transient
- **Root Cause**: {{description}}
- **First Failed**: {{commit_hash}} ({{date}})
- **Fix**: {{recommended_action}}
- **Verified**: [ ] Fix applied and test passes 3x consecutively
```
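The final checkbox is easy to script. A small verification loop, assuming the fix is already applied, that stops at the first failure:

```bash
# Require 3 consecutive passes before marking the report verified
for i in 1 2 3; do
  npx jest {{test_file}} || { echo "FAILED on run $i"; exit 1; }
done
echo "Passed 3x consecutively"
```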
## Composition
After investigation, compose with:
- **`/bug-reporting-excellence`** — if regression found, file a bug report
- **`/regression-testing`** — if regression, add to regression suite
- **`/qe-test-execution`** — for re-running tests after fix
## Gotchas
- Agent may guess at root cause without running the test — always reproduce first
- "Works on my machine" is not a diagnosis — compare environments (node version, OS, deps)
- Flaky tests that pass 9/10 times will still be reported as "passing" by CI — run 10+ times
- Test isolation failures are the #1 cause of flaky tests — check for shared state in beforeAll/afterAll

Related Skills
qe-visual-testing-advanced
Advanced visual regression testing with pixel-perfect comparison, AI-powered diff analysis, responsive design validation, and cross-browser visual consistency. Use when detecting UI regressions, validating designs, or ensuring visual consistency.
qe-testability-scoring
AI-powered testability assessment using 10 principles of intrinsic testability with Playwright and optional Vibium integration. Evaluates web applications against Observability, Controllability, Algorithmic Simplicity, Transparency, Stability, Explainability, Unbugginess, Smallness, Decomposability, and Similarity. Use when assessing software testability, evaluating test readiness, identifying testability improvements, or generating testability reports.
qe-test-reporting-analytics
Advanced test reporting, quality dashboards, predictive analytics, trend analysis, and executive reporting for QE metrics. Use when communicating quality status, tracking trends, or making data-driven decisions.
qe-test-idea-rewriting
Transform passive 'Verify X' test descriptions into active, observable test actions. Use when test ideas lack specificity, use vague language, or fail quality validation. Converts to action-verb format for clearer, more testable descriptions.
qe-test-environment-management
Test environment provisioning, infrastructure as code for testing, Docker/Kubernetes for test environments, service virtualization, and cost optimization. Use when managing test infrastructure, ensuring environment parity, or optimizing testing costs.
qe-test-design-techniques
Systematic test design with boundary value analysis, equivalence partitioning, decision tables, state transition testing, and combinatorial testing. Use when designing comprehensive test cases, reducing redundant tests, or ensuring systematic coverage.
qe-test-data-management
Strategic test data generation, management, and privacy compliance. Use when creating test data, handling PII, ensuring GDPR/CCPA compliance, or scaling data generation for realistic testing scenarios.
qe-test-automation-strategy
Design and implement effective test automation with proper pyramid, patterns, and CI/CD integration. Use when building automation frameworks or improving test efficiency.
qe-shift-right-testing
Testing in production with feature flags, canary deployments, synthetic monitoring, and chaos engineering. Use when implementing production observability or progressive delivery.
qe-shift-left-testing
Move testing activities earlier in the development lifecycle to catch defects when they're cheapest to fix. Use when implementing TDD, CI/CD, or early quality practices.
qe-security-visual-testing
Security-first visual testing combining URL validation, PII detection, and visual regression with parallel viewport support. Use when testing web applications that handle sensitive data, need visual regression coverage, or require WCAG accessibility compliance.
qe-security-testing
Test for security vulnerabilities using OWASP principles. Use when conducting security audits, testing auth, or implementing security practices.