analyze-test-failures
This skill should be used when the user asks to "analyze failing tests", "debug test failures", "investigate test errors", or provides specific failing test cases to examine. It analyzes failing test cases with a balanced, investigative approach to determine whether failures indicate test issues or genuine bugs.
Best use case
analyze-test-failures is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Teams using analyze-test-failures can expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in .claude/skills/analyze-test-failures/SKILL.md inside your project
- Restart your AI agent; it will auto-discover the skill
How analyze-test-failures Compares
| Feature / Agent | analyze-test-failures | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
It analyzes failing test cases with a balanced, investigative approach to determine whether each failure indicates a problem in the test or a genuine bug in the implementation.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# Analyze Test Failures
Analyze failing test cases with a balanced, investigative approach.
## Context
When tests fail, there are two primary possibilities:
1. **False positive**: The test itself is incorrect
2. **True positive**: The test discovered a genuine bug
Assuming tests are wrong by default is a dangerous anti-pattern that defeats the purpose of testing.
## Analysis Process
### 1. Initial Analysis
- Read the failing test carefully, understanding its intent
- Examine the test's assertions and expected behavior
- Review the error message and stack trace
### 2. Investigate the Implementation
- Check the actual implementation being tested
- Trace through the code path that leads to the failure
- Verify that implementation matches documented behavior
### 3. Apply Critical Thinking
For each failing test, ask:
- What behavior is the test trying to verify?
- Is this behavior clearly documented or implied by the API design?
- Does the current implementation actually provide this behavior?
- Could this be an edge case the implementation missed?
### 4. Make a Determination
Classify the failure as one of:
| Classification | Meaning |
| ---------------------- | --------------------------------- |
| **Test Bug** | Test's expectations are incorrect |
| **Implementation Bug** | Code doesn't behave as it should |
| **Ambiguous** | Intended behavior is unclear |
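The three classifications can be modeled as a small enum so that every analysis ends in exactly one of them. The sketch below is illustrative only: `Determination`, `classify`, and the `expectation_supported` parameter are hypothetical names, and reducing the investigation to a single yes/no/unknown answer is a deliberate simplification.

```python
from enum import Enum
from typing import Optional


class Determination(Enum):
    """The three outcomes from the classification table."""
    TEST_BUG = "Test Bug"
    IMPLEMENTATION_BUG = "Implementation Bug"
    AMBIGUOUS = "Ambiguous"


def classify(expectation_supported: Optional[bool]) -> Determination:
    """Map the outcome of the investigation to a determination.

    expectation_supported (hypothetical parameter):
      True  -> documentation or API design supports what the test expects
      False -> documentation or API design contradicts what the test expects
      None  -> the intended behavior could not be established
    """
    if expectation_supported is None:
        return Determination.AMBIGUOUS
    if expectation_supported:
        # The test is right and it fails, so the code is at fault.
        return Determination.IMPLEMENTATION_BUG
    return Determination.TEST_BUG
```

In practice the answer comes from the questions in step 3; the point of the sketch is only that "unknown" maps to Ambiguous rather than defaulting to a test bug.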
### 5. Document Reasoning
Provide a clear explanation, including:
- Evidence supporting the conclusion
- Specific mismatch between expectation and reality
- Recommended fix (to test or implementation)
## Example Analyses
### Example 1: Ambiguous Behavior
**Scenario**: Test expects `calculateDiscount(100, 0.2)` to return 20, but it returns 80
**Analysis**:
- Test assumes function returns discount amount
- Implementation returns price after discount
- Function name is ambiguous
**Determination**: Ambiguous
**Recommendation**: Check documentation or clarify intended behavior
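To make the ambiguity concrete, here is a minimal Python sketch of the two readings of `calculateDiscount`. Both function names are hypothetical and neither is claimed to be the real implementation; they only show that the same inputs produce 20 under one reading and 80 under the other.

```python
import math


def discount_amount(price: float, rate: float) -> float:
    """Reading 1 (what the test assumes): the amount taken off."""
    return price * rate


def price_after_discount(price: float, rate: float) -> float:
    """Reading 2 (what the implementation does): the price left to pay."""
    return price * (1 - rate)


# The same call satisfies a different expectation under each reading.
# math.isclose avoids depending on exact floating-point results.
test_reading_holds = math.isclose(discount_amount(100, 0.2), 20)
impl_reading_holds = math.isclose(price_after_discount(100, 0.2), 80)
```

Because each reading is self-consistent, neither the test nor the code can be called wrong until the intended meaning is confirmed.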
### Example 2: Implementation Bug
**Scenario**: Test expects `validateEmail("user@example.com")` to return true, but it returns false
**Analysis**:
- Test provides a valid email format
- Implementation regex is missing support for dots in domain
- Other valid emails also fail
**Determination**: Implementation Bug
**Recommendation**: Fix the regex to properly validate email addresses per RFC standards
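A bug of this kind might look like the following Python sketch. Both patterns are assumptions made up for illustration, not the actual implementation, and the "fixed" pattern is a deliberately simple check that does not implement the full RFC grammar.

```python
import re

# Hypothetical buggy pattern: the domain part allows no dots,
# so "example.com" can never match.
BUGGY_EMAIL = re.compile(r"[\w.+-]+@[A-Za-z0-9-]+")

# Hypothetical fix: the domain is one label followed by one or more
# dot-separated labels.
FIXED_EMAIL = re.compile(r"[\w.+-]+@[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)+")


def validate_email_buggy(address: str) -> bool:
    return BUGGY_EMAIL.fullmatch(address) is not None


def validate_email_fixed(address: str) -> bool:
    return FIXED_EMAIL.fullmatch(address) is not None
```

Checking a second valid address such as `first.last@mail.example.org` against both versions is a quick way to confirm the "other valid emails also fail" observation before changing any code.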
### Example 3: Test Bug
**Scenario**: Test expects `divide(10, 0)` to return 0, but it throws an error
**Analysis**:
- Test assumes division by zero returns 0
- Implementation throws DivisionByZeroError
- Division by zero is mathematically undefined, so raising an error is the standard behavior
**Determination**: Test Bug
**Recommendation**: Update test to expect an error, not 0
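The corrected test could look like this Python sketch using the standard `unittest` module. The `divide` function is a hypothetical stand-in, and Python's built-in `ZeroDivisionError` is used in place of the `DivisionByZeroError` named above.

```python
import unittest


def divide(a: float, b: float) -> float:
    """Hypothetical implementation under test; raises on a zero divisor."""
    return a / b


class DivideTest(unittest.TestCase):
    def test_divide_by_zero_raises(self):
        # Corrected expectation: an error, not a return value of 0.
        with self.assertRaises(ZeroDivisionError):
            divide(10, 0)

    def test_divide_regular_values(self):
        self.assertEqual(divide(10, 2), 5)
```

Saved as a module, this can be run with `python -m unittest`.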
## Output Format
For each failing test, provide:
```text
Test: [test name/description]
Failure: [what failed and how]
Investigation:
- Test expects: [expected behavior]
- Implementation does: [actual behavior]
- Root cause: [why they differ]
Determination: [Test Bug | Implementation Bug | Ambiguous]
Recommendation:
[Specific fix to either test or implementation]
```
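If reports are generated programmatically, the template above can be filled from a small data structure. This is a sketch only: `FailureReport` and its field names are hypothetical, chosen to mirror the placeholders in the template.

```python
from dataclasses import dataclass

ALLOWED_DETERMINATIONS = {"Test Bug", "Implementation Bug", "Ambiguous"}


@dataclass
class FailureReport:
    test: str
    failure: str
    expects: str
    actual: str
    root_cause: str
    determination: str
    recommendation: str

    def __post_init__(self) -> None:
        # Reject anything outside the three classifications.
        if self.determination not in ALLOWED_DETERMINATIONS:
            raise ValueError(f"unknown determination: {self.determination!r}")

    def render(self) -> str:
        """Return the report in the text layout shown above."""
        return "\n".join([
            f"Test: {self.test}",
            f"Failure: {self.failure}",
            "Investigation:",
            f"- Test expects: {self.expects}",
            f"- Implementation does: {self.actual}",
            f"- Root cause: {self.root_cause}",
            f"Determination: {self.determination}",
            "Recommendation:",
            self.recommendation,
        ])


# Usage: Example 3 expressed as a report.
report = FailureReport(
    test="divide(10, 0)",
    failure="Expected 0, but an error was thrown",
    expects="division by zero returns 0",
    actual="throws DivisionByZeroError",
    root_cause="the test encodes an incorrect expectation",
    determination="Test Bug",
    recommendation="Update the test to expect an error, not 0",
)
print(report.render())
```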
## Key Principles
- NEVER automatically assume the test is wrong
- ALWAYS consider that the test might have found a real bug
- When uncertain, lean toward investigating the implementation
- Tests are often your specification - they define expected behavior
- A failing test is a gift - it's either catching a bug or clarifying requirements
## Related Skills
- **test-failure-mindset**: Sets an investigative approach for the session
- **comprehensive-test-review**: Full test suite review
Related Skills
e2e-testing
End-to-end testing workflow with Playwright for browser automation, visual regression, cross-browser testing, and CI/CD integration.
e2e-testing-patterns
Master end-to-end testing with Playwright and Cypress to build reliable test suites that catch bugs, improve confidence, and enable fast deployment. Use when implementing E2E tests, debugging flaky tests, or establishing testing standards.
e2e-outside-in-test-generator
Generates comprehensive end-to-end Playwright tests using outside-in methodology
dotnet-uno-testing
Tests Uno Platform apps. Playwright for WASM, platform-specific patterns, runtime heads.
cve-testing
CVE vulnerability testing coordinator that identifies technology stacks, researches known vulnerabilities, and tests applications for exploitable CVEs using public exploits and proof-of-concept code.
cui-javascript-unit-testing
Jest unit testing standards covering configuration, test structure, testing patterns, and coverage requirements
core-tester
Comprehensive testing and quality assurance specialist for ensuring code quality through testing strategies
configure-ux-testing
Check and configure UX testing infrastructure (Playwright, accessibility, visual regression)
comprehensive-unit-testing-with-pytest
Aims for high test coverage using pytest, testing both common and edge cases.
Burp Suite Web Application Testing
This skill should be used when the user asks to "intercept HTTP traffic", "modify web requests", "use Burp Suite for testing", "perform web vulnerability scanning", "test with Burp Repeater", "analyze HTTP history", or "configure proxy for web testing". It provides comprehensive guidance for using Burp Suite's core features for web application security testing.
burp-suite-testing
This skill should be used when the user asks to "intercept HTTP traffic", "modify web requests", "use Burp Suite for testing", "perform web vulnerability scanning", "test with Burp ...
backtesting-frameworks
Build robust backtesting systems for trading strategies with proper handling of look-ahead bias, survivorship bias, and transaction costs. Use when developing trading algorithms, validating strateg...