test-failure-investigator
Use when a test is failing and you need to determine root cause: is it flaky, an environment issue, or a real regression? Traces failure from symptom to fix.
Best use case
test-failure-investigator is best used when you need a repeatable AI agent workflow instead of a one-off prompt: a test is failing and you must determine whether the root cause is flakiness, an environment issue, or a real regression, then trace the failure from symptom to fix.
Teams using test-failure-investigator should expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Manual Installation (Claude Code / Cursor / Codex)
- Download SKILL.md from GitHub
- Place it in `.claude/skills/test-failure-investigator/SKILL.md` inside your project
- Restart your AI agent — it will auto-discover the skill
How test-failure-investigator Compares
| Feature / Agent | test-failure-investigator | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
It investigates a failing test to determine whether the root cause is flakiness, an environment issue, or a real regression, then traces the failure from symptom to fix.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# Test Failure Investigator
Runbook-style skill for systematic test failure investigation. Given a failing test, determines root cause and recommends action.
## Activation
```
/test-failure-investigator [test-name-or-file]
```
## Investigation Flow
### Step 1: Classify the Failure
Run the test 3 times and classify:
| Result Pattern | Classification | Action |
|---------------|---------------|--------|
| Fails consistently | **Regression** or **Environment** | Continue to Step 2 |
| Fails intermittently | **Flaky** | Skip to Step 4 |
| Passes now | **Transient** | Check CI logs, environment diff |
```bash
# Run test 3 times
for i in 1 2 3; do echo "--- Run $i ---"; npx jest {{test_file}} 2>&1 | tail -5; done
```
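When the test passes locally but failed in CI, diff the two environments before digging further. A minimal fingerprint sketch, assuming a Node/Jest project; run it on both machines and diff the output (the exact fields captured here are a suggestion, not part of the skill):

```bash
# Capture an environment fingerprint; run locally and in CI, then diff the files
{
  node --version                  # runtime version
  npm --version
  uname -sm                       # OS and architecture
  git rev-parse HEAD              # exact commit under test
  npm ls --depth=0 2>/dev/null    # top-level dependency versions
} > env-fingerprint.txt
```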
### Step 2: Narrow the Scope
```bash
# When did it start failing?
git log --oneline -20 -- {{related_source_files}}
# What changed recently?
git diff HEAD~5 -- {{related_source_files}}
# Does it fail in isolation?
npx jest {{test_file}} --testNamePattern="{{test_name}}"
# Does it fail with other tests?
npx jest --runInBand # sequential execution
```
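If the recent diff does not point at a culprit, `git bisect` can find the first bad commit automatically. A sketch, assuming the failure is deterministic; `{{last_known_good}}` is a placeholder (in the skill's template style) for a commit where the test passed:

```bash
# Binary-search history for the commit that broke the test
git bisect start
git bisect bad                                # current HEAD fails
git bisect good {{last_known_good}}           # known-good commit
# jest's exit code (0 = pass, non-zero = fail) drives the search at each step
git bisect run npx jest {{test_file}} --testNamePattern="{{test_name}}"
git bisect reset                              # return to the original HEAD
```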
### Step 3: Root Cause Analysis
| Symptom | Likely Cause | Investigation |
|---------|-------------|--------------|
| Timeout | Network/DB dependency | Check external service availability |
| Assertion mismatch | Logic change | Compare expected vs actual, check git blame |
| Import error | Dependency change | Check package.json changes, run `npm ci` |
| Permission denied | Environment | Check file permissions, Docker volumes |
| Out of memory | Resource leak | Profile with `--detectOpenHandles` |
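The investigation column maps to a few concrete commands. A sketch of the usual first checks; the line-range and file placeholders are illustrative, in the skill's template style:

```bash
# Assertion mismatch: who last changed the code under test?
git blame -L {{start_line}},{{end_line}} {{source_file}}

# Import error: reinstall exactly from the lockfile, then inspect dependency churn
npm ci
git diff HEAD~5 -- package.json package-lock.json

# Timeout / out of memory: surface leaked handles (sockets, timers, DB pools)
npx jest {{test_file}} --detectOpenHandles --runInBand
```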
### Step 4: Flaky Test Investigation
```bash
# Run 10 times to confirm flakiness
for i in $(seq 1 10); do npx jest {{test_file}} --forceExit 2>&1 | grep -E 'PASS|FAIL'; done
# Common flaky causes:
# - Shared state between tests (missing cleanup)
# - Time-dependent assertions (use fake timers)
# - Race conditions (missing await)
# - Port conflicts (use random ports)
# - Order dependency (run with --randomize)
```
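To chase order dependency specifically, Jest 29.3+ can shuffle test order with a reproducible seed. A sketch: keep shuffling until the failure appears, then rerun with that seed to pin it (`{{failing_seed}}` is a placeholder in the skill's template style):

```bash
# Shuffle test order until it fails; --showSeed prints the seed for each run
for i in $(seq 1 10); do
  npx jest {{test_file}} --randomize --showSeed 2>&1 | grep -E 'PASS|FAIL|Seed'
done

# Reproduce deterministically with the seed from a failing run
npx jest {{test_file}} --randomize --seed {{failing_seed}}
```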
### Step 5: Report
```markdown
## Test Failure Report
- **Test**: {{test_name}}
- **File**: {{test_file}}
- **Classification**: Regression / Flaky / Environment / Transient
- **Root Cause**: {{description}}
- **First Failed**: {{commit_hash}} ({{date}})
- **Fix**: {{recommended_action}}
- **Verified**: [ ] Fix applied and test passes 3x consecutively
```
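The final checkbox is easy to script. A small verification loop, assuming the fix is already applied, that stops at the first failure:

```bash
# Require 3 consecutive passes before marking the report verified
for i in 1 2 3; do
  npx jest {{test_file}} || { echo "FAILED on run $i"; exit 1; }
done
echo "Passed 3x consecutively"
```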
## Composition
After investigation, compose with:
- **`/bug-reporting-excellence`** — if regression found, file a bug report
- **`/regression-testing`** — if regression, add to regression suite
- **`/qe-test-execution`** — for re-running tests after fix
## Gotchas
- Agent may guess at root cause without running the test — always reproduce first
- "Works on my machine" is not a diagnosis — compare environments (node version, OS, deps)
- Flaky tests that pass 9/10 times will still be reported as "passing" by CI — run 10+ times
- Test isolation failures are the #1 cause of flaky tests — check for shared state in beforeAll/afterAll

Related Skills
qe-visual-testing-advanced
Advanced visual regression testing with pixel-perfect comparison, AI-powered diff analysis, responsive design validation, and cross-browser visual consistency. Use when detecting UI regressions, validating designs, or ensuring visual consistency.
qe-testability-scoring
AI-powered testability assessment using 10 principles of intrinsic testability with Playwright and optional Vibium integration. Evaluates web applications against Observability, Controllability, Algorithmic Simplicity, Transparency, Stability, Explainability, Unbugginess, Smallness, Decomposability, and Similarity. Use when assessing software testability, evaluating test readiness, identifying testability improvements, or generating testability reports.
qe-test-reporting-analytics
Advanced test reporting, quality dashboards, predictive analytics, trend analysis, and executive reporting for QE metrics. Use when communicating quality status, tracking trends, or making data-driven decisions.
qe-test-idea-rewriting
Transform passive 'Verify X' test descriptions into active, observable test actions. Use when test ideas lack specificity, use vague language, or fail quality validation. Converts to action-verb format for clearer, more testable descriptions.
qe-test-environment-management
Test environment provisioning, infrastructure as code for testing, Docker/Kubernetes for test environments, service virtualization, and cost optimization. Use when managing test infrastructure, ensuring environment parity, or optimizing testing costs.
qe-test-design-techniques
Systematic test design with boundary value analysis, equivalence partitioning, decision tables, state transition testing, and combinatorial testing. Use when designing comprehensive test cases, reducing redundant tests, or ensuring systematic coverage.
qe-test-data-management
Strategic test data generation, management, and privacy compliance. Use when creating test data, handling PII, ensuring GDPR/CCPA compliance, or scaling data generation for realistic testing scenarios.
qe-test-automation-strategy
Design and implement effective test automation with proper pyramid, patterns, and CI/CD integration. Use when building automation frameworks or improving test efficiency.
qe-shift-right-testing
Testing in production with feature flags, canary deployments, synthetic monitoring, and chaos engineering. Use when implementing production observability or progressive delivery.
qe-shift-left-testing
Move testing activities earlier in the development lifecycle to catch defects when they're cheapest to fix. Use when implementing TDD, CI/CD, or early quality practices.
qe-security-visual-testing
Security-first visual testing combining URL validation, PII detection, and visual regression with parallel viewport support. Use when testing web applications that handle sensitive data, need visual regression coverage, or require WCAG accessibility compliance.
qe-security-testing
Test for security vulnerabilities using OWASP principles. Use when conducting security audits, testing auth, or implementing security practices.