regression-report
Generate comprehensive regression analysis reports combining bisect, baseline, and metrics data with actionable recommendations
Best use case
regression-report is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
It is a strong fit for teams already working in Codex.
Generate comprehensive regression analysis reports combining bisect, baseline, and metrics data with actionable recommendations
Teams using regression-report should expect a more consistent output, faster repeated execution, less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in
.claude/skills/regression-report/SKILL.mdinside your project - Restart your AI agent — it will auto-discover the skill
How regression-report Compares
| Feature / Agent | regression-report | Standard Approach |
|---|---|---|
| Platform Support | Codex | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Generate comprehensive regression analysis reports combining bisect, baseline, and metrics data with actionable recommendations
Which AI agents support this skill?
This skill is designed for Codex.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
Related Guides
Cursor vs Codex for AI Workflows
Compare Cursor and Codex for AI coding workflows, repository assistance, debugging, refactoring, and reusable developer skills.
AI Agents for Coding
Browse AI agent skills for coding, debugging, testing, refactoring, code review, and developer workflows across Claude, Cursor, and Codex.
AI Agent for Product Research
Browse AI agent skills for product research, competitive analysis, customer discovery, and structured product decision support.
SKILL.md Source
# regression-report
Generate comprehensive regression analysis reports combining bisect, baseline, and metrics data.
## Triggers
Alternate expressions and non-obvious activations (primary phrases are matched automatically from the skill description):
- "full regression report" → comprehensive regression analysis
- "test health" → regression summary report
## Purpose
This skill produces comprehensive regression reports by:
- Synthesizing data from bisect, baseline, and metrics
- Correlating regressions with code changes
- Identifying systemic issues and patterns
- Providing actionable recommendations
- Creating executive summaries
- Supporting postmortem analysis
## Behavior
When triggered, this skill:
1. **Gathers regression data**:
- Load bisect reports for root cause
- Import baseline comparison results
- Fetch regression metrics and trends
- Retrieve issue tracker data
- Collect CI/CD logs
2. **Correlates information**:
- Link regressions to code changes
- Map to requirements and features
- Identify common root causes
- Connect to team/component ownership
3. **Analyzes impact**:
- Assess user impact
- Calculate downtime/degradation
- Measure business cost
- Evaluate reputation impact
4. **Identifies patterns**:
- Recurring regression types
- High-risk components
- Time-based patterns
- Process gaps
5. **Generates insights**:
- Root cause categories
- Prevention strategies
- Process improvements
- Tooling recommendations
6. **Produces reports**:
- Executive summary
- Technical deep-dive
- Action plan
- Lessons learned
## Report Types
### Incident Report
```yaml
incident_report:
description: Single regression deep-dive
scope: One specific regression incident
audience: Technical team and stakeholders
sections:
- incident_summary
- timeline
- root_cause_analysis
- impact_assessment
- resolution_steps
- prevention_measures
- lessons_learned
```
### Sprint Regression Report
```yaml
sprint_report:
description: All regressions in a sprint
scope: Sprint boundary
audience: Development team
sections:
- sprint_summary
- regression_list
- metrics_summary
- hotspot_analysis
- recommendations
- goals_for_next_sprint
```
### Release Regression Report
```yaml
release_report:
description: Regression analysis for release
scope: Release cycle
audience: Release management and stakeholders
sections:
- release_summary
- regression_timeline
- escape_analysis
- quality_gates_review
- lessons_learned
- release_readiness
```
### Quarterly Regression Analysis
```yaml
quarterly_report:
description: Strategic regression analysis
scope: 3 months
audience: Leadership and engineering managers
sections:
- executive_summary
- trend_analysis
- strategic_insights
- investment_recommendations
- process_improvements
- success_metrics
```
## Comprehensive Report Format
```markdown
# Comprehensive Regression Report
**Period**: Sprint 13 (2026-01-14 to 2026-01-28)
**Project**: User Service
**Report Type**: Sprint Regression Analysis
**Generated**: 2026-01-28 17:00:00
**Analyzer**: regression-report skill
---
## Executive Summary
**Overall Assessment**: ⚠️ Acceptable Quality with Concerns
| Metric | Value | Status |
|--------|-------|--------|
| Regressions Detected | 4 | ✅ Within target (< 5) |
| Production Escapes | 1 | ⚠️ Above target (0 preferred) |
| Mean Time to Detect | 8.5h | ✅ Excellent |
| Mean Time to Fix | 18.7h | ⚠️ Close to target |
| User Impact | 500 users | ⚠️ Medium |
**Key Findings**:
- Regression rate within acceptable range but auth module concerning
- One production escape indicates staging test gap
- Detection speed excellent, fix time acceptable
- Systemic issue: Integration testing coverage insufficient
**Priority Actions**:
1. Add integration tests for authentication flows
2. Improve staging environment test coverage
3. Implement regression test requirement for all fixes
---
## Regression Inventory
### REG-001: JWT Issuer Validation Breaks Existing Sessions
**Severity**: High
**Status**: Fixed (deployed 2026-01-16)
**Detection**: Automated test failure
**Environment**: Staging
**Timeline**:
- 2026-01-15 14:32: Breaking commit merged (pqr901)
- 2026-01-15 16:15: CI test failure detected (2h MTTD)
- 2026-01-15 18:45: Root cause identified via bisect
- 2026-01-16 09:30: Fix deployed (15h MTTF)
**Root Cause**: Required `iss` claim in JWT validation without backward compatibility
**Impact**: All existing user sessions invalidated
**Files Changed**:
- src/auth/validate-token.ts (+15/-5)
- src/auth/generate-token.ts (+8/-2)
**Fix**: Added feature flag for gradual rollout
**Prevention**:
- [ ] Add integration tests for auth changes
- [ ] Require backward compatibility review
**References**:
- Bisect Report: @.aiwg/testing/regression-bisect-jwt.md
- Issue: #456
- PR: #789
---
### REG-002: User Profile Update Returns 500 on Invalid Email
**Severity**: Critical
**Status**: Fixed (deployed 2026-01-18)
**Detection**: Production monitoring
**Environment**: Production
**Timeline**:
- 2026-01-17 08:00: Deployed to production
- 2026-01-17 20:15: Error spike detected (12h MTTD)
- 2026-01-18 02:30: Root cause identified
- 2026-01-18 04:15: Hotfix deployed (8h MTTF)
**Root Cause**: Validation middleware error handling broken, returns 500 instead of 400
**Impact**:
- 500 users affected
- 45 support tickets
- 12 hours partial service degradation
**Files Changed**:
- src/api/validation-middleware.ts (+3/-8)
**Cost**:
- Engineering: 8 hours incident response
- Support: 12 hours ticket triage
- Reputation: Customer satisfaction impact
**Fix**: Restored proper error handling
**Prevention**:
- [ ] Add error code validation tests
- [ ] Improve staging test coverage
- [ ] Add monitoring for error rate spikes
**References**:
- Incident Report: @.aiwg/incidents/INC-2026-001.md
- Issue: #478
- Hotfix PR: #791
---
### REG-003: Password Reset Email Fails for Gmail Users
**Severity**: Medium
**Status**: Fixed (deployed 2026-01-22)
**Detection**: Manual QA testing
**Environment**: Staging
**Timeline**:
- 2026-01-20 11:00: Code merged
- 2026-01-21 15:30: QA detected (28.5h MTTD)
- 2026-01-22 10:00: Fix deployed (18.5h MTTF)
**Root Cause**: Email template contained invalid HTML characters rejected by Gmail
**Impact**: Password reset broken for ~40% of user base (Gmail users)
**Files Changed**:
- templates/email/password-reset.html (+5/-3)
**Fix**: HTML entity encoding for special characters
**Prevention**:
- [ ] Add email template validation
- [ ] Test with multiple email providers
- [ ] Add email delivery monitoring
**References**:
- Issue: #482
- PR: #794
---
### REG-004: Dashboard Widget Load Time Increased
**Severity**: Low
**Status**: Fixed (deployed 2026-01-26)
**Detection**: Performance baseline comparison
**Environment**: Staging
**Timeline**:
- 2026-01-24 09:00: Performance test detected slowdown
- 2026-01-24 10:15: Root cause identified (1.25h MTTD)
- 2026-01-26 08:00: Optimization deployed (45.75h MTTF)
**Root Cause**: N+1 query introduced in dashboard widget
**Impact**: Dashboard load time increased from 0.8s to 2.3s
**Files Changed**:
- src/dashboard/widgets/activity-feed.ts (+12/-5)
**Fix**: Added eager loading to eliminate N+1
**Prevention**:
- [ ] Add performance regression tests
- [ ] Code review focus on query optimization
**References**:
- Baseline Comparison: @.aiwg/testing/baseline-comparison-perf.md
- Issue: #485
- PR: #797
---
## Impact Analysis
### User Impact
| Regression | Users Affected | Duration | Severity |
|------------|----------------|----------|----------|
| REG-001 | 0 (staging) | N/A | High (avoided) |
| REG-002 | 500 | 12h | Critical |
| REG-003 | 0 (staging) | N/A | Medium (avoided) |
| REG-004 | 0 (staging) | N/A | Low (avoided) |
**Total Production Impact**: 500 users, 12 hours
**Escapes Prevented**: 3 of 4 regressions caught pre-production
### Business Impact
```
Direct Costs:
- Engineering incident response: 24 hours @ $150/h = $3,600
- Support ticket triage: 12 hours @ $75/h = $900
Total Direct: $4,500
Indirect Costs:
- Customer satisfaction impact (estimated)
- Reputation damage (1 production incident)
- Lost productivity (500 users × 12h)
Estimated Indirect: $8,000
Total Estimated Cost: $12,500
```
### Component Impact
| Component | Regressions | User Impact | Risk Level |
|-----------|-------------|-------------|------------|
| src/auth/ | 1 | High (avoided) | 🔴 High |
| src/api/ | 1 | Critical (500 users) | 🔴 High |
| templates/ | 1 | Medium (avoided) | 🟡 Medium |
| src/dashboard/ | 1 | Low (avoided) | 🟢 Low |
**High-Risk Components**: Auth and API modules require increased testing
---
## Root Cause Analysis
### Root Cause Categories
| Category | Count | % | Examples |
|----------|-------|---|----------|
| Missing integration tests | 2 | 50% | REG-001, REG-002 |
| Missing validation | 1 | 25% | REG-003 |
| Performance not tested | 1 | 25% | REG-004 |
**Systemic Issue**: Integration testing coverage insufficient (50% of regressions)
### Prevention Success Rate
| Prevention Measure | Present? | Effective? |
|-------------------|----------|------------|
| Unit tests | ✅ Yes | ⚠️ Partial |
| Integration tests | ❌ No | N/A |
| Performance tests | ⚠️ Partial | ⚠️ Partial |
| Email validation | ❌ No | N/A |
| Staging environment | ✅ Yes | ✅ Yes (caught 3/4) |
**Gap**: Integration tests would have prevented 50% of regressions
---
## Metrics Summary
### Sprint Performance
| Metric | Sprint 13 | Sprint 12 | Change |
|--------|-----------|-----------|--------|
| Regressions | 4 | 4 | → Stable |
| MTTD | 8.5h | 12h | ↓ Improved |
| MTTF | 18.7h | 28.4h | ↓ Improved |
| Escape Rate | 25% (1/4) | 0% | ↑ Worsened |
**Trend**: Detection and fix times improving, but production escape concerning
### Quality Gates
| Gate | Target | Actual | Status |
|------|--------|--------|--------|
| Unit test coverage | 80% | 85% | ✅ Pass |
| Integration test coverage | 60% | 42% | ❌ Fail |
| Performance baseline | No degradation | 1 regression | ⚠️ Warning |
| Security scan | No critical | Pass | ✅ Pass |
**Failed Gate**: Integration test coverage below target
---
## Recommendations
### Immediate (This Week)
**Priority 1: Add Integration Tests for Auth**
- Reason: 1 auth regression, staging escape
- Impact: Prevent auth-related regressions
- Effort: 2 days
- Owner: Test Engineer
- Target: +20% integration coverage
**Priority 2: Fix Email Template Validation**
- Reason: REG-003 could have been automated
- Impact: Catch email issues pre-deployment
- Effort: 1 day
- Owner: DevOps Engineer
**Priority 3: Implement Regression Test Requirement**
- Reason: 25% recurrence rate on fixes without tests
- Impact: Prevent regression recurrence
- Effort: Process change (1 hour)
- Owner: Tech Lead
### Short-term (This Sprint)
**Priority 4: Improve Staging Test Coverage**
- Reason: Production escape indicates gap
- Impact: Reduce escape rate to <5%
- Effort: 1 week
- Owner: Test Architect
**Priority 5: Add Performance Regression Tests**
- Reason: REG-004 caught late
- Impact: Earlier performance issue detection
- Effort: 3 days
- Owner: Performance Engineer
**Priority 6: Review Error Handling Standards**
- Reason: REG-002 returned 500 instead of 400
- Impact: Consistent error behavior
- Effort: 2 days (review + guidelines)
- Owner: API Designer
### Ongoing
**Priority 7: Require Integration Tests for Auth PRs**
- Reason: Auth module high-risk
- Impact: No auth regressions without tests
- Effort: PR template update
- Owner: Tech Lead
**Priority 8: Weekly Regression Review**
- Reason: Early pattern identification
- Impact: Faster response to trends
- Effort: 30 min/week
- Owner: Test Lead
---
## Lessons Learned
### What Went Well
1. **Fast Detection**: MTTD of 8.5h shows automation working
2. **Staging Environment**: Caught 3 of 4 regressions before production
3. **Bisect Tooling**: Root cause identification very fast
4. **Team Response**: Fast fix times (18.7h MTTF)
### What Didn't Go Well
1. **Production Escape**: REG-002 bypassed all pre-production testing
2. **Integration Coverage**: Too low to catch cross-component issues
3. **Email Validation**: No automated testing for email templates
4. **Backward Compatibility**: REG-001 broke existing sessions
### Process Improvements
| Issue | Improvement | Timeline |
|-------|-------------|----------|
| Production escape | Add integration test gates | This sprint |
| Missing email validation | Automate template testing | This sprint |
| Auth regressions | Require integration tests | Immediate |
| Performance regressions | Add performance baselines | Next sprint |
---
## Sprint Goals for Sprint 14
Based on this analysis, Sprint 14 regression goals:
| Goal | Target | Success Criteria |
|------|--------|------------------|
| Regression Rate | < 4 | Fewer total regressions |
| Integration Coverage | 60% | Meet quality gate |
| Escape Rate | 0% | No production regressions |
| MTTD | < 8h | Maintain current level |
| MTTF | < 18h | Slightly faster fixes |
**Focus Areas**: Integration testing, staging coverage, auth module
---
## Appendices
### A. Regression Timeline
```
2026-01-15: REG-001 introduced (auth JWT)
2026-01-15: REG-001 detected (2h)
2026-01-16: REG-001 fixed (15h)
2026-01-17: REG-002 introduced (validation)
2026-01-17: REG-002 escaped to production
2026-01-18: REG-002 detected (12h)
2026-01-18: REG-002 fixed (8h)
2026-01-20: REG-003 introduced (email)
2026-01-21: REG-003 detected (28.5h)
2026-01-22: REG-003 fixed (18.5h)
2026-01-24: REG-004 introduced (performance)
2026-01-24: REG-004 detected (1.25h)
2026-01-26: REG-004 fixed (45.75h)
```
### B. References
- Bisect Reports: @.aiwg/testing/regression-bisect-*/
- Baseline Comparisons: @.aiwg/testing/baseline-comparisons/
- Metrics Dashboard: @.aiwg/testing/regression-metrics-dashboard.md
- Incident Reports: @.aiwg/incidents/
- Issues: GitHub Issues (label: regression)
### C. Data Sources
- Regression test results: `.aiwg/testing/regression-results/`
- CI/CD logs: GitHub Actions
- Issue tracker: GitHub Issues
- Monitoring: Datadog
- User reports: Support tickets
```
## Usage Examples
### Generate Sprint Report
```
User: "Generate regression report for this sprint"
Skill executes:
1. Identify sprint boundary (Sprint 13)
2. Collect all regression data
3. Correlate with code changes
4. Analyze patterns and impact
5. Generate recommendations
Output:
"Sprint Regression Report Generated
Sprint 13 Summary:
- 4 regressions detected
- 1 production escape
- 8.5h MTTD, 18.7h MTTF
- 500 users impacted
Key Findings:
- Integration testing gap (50% of regressions)
- Auth module high-risk (needs attention)
- Staging caught 75% (good)
Top Recommendations:
1. Add integration tests for auth
2. Improve staging coverage
3. Require regression tests for fixes
Full report: .aiwg/testing/regression-report-sprint-13.md"
```
### Incident Postmortem
```
User: "Regression report for the validation incident"
Skill analyzes:
- REG-002 incident data
- Timeline and impact
- Root cause from bisect
- Related regressions
Output:
"Incident Postmortem: REG-002
Timeline: 12 hours (detection to fix)
Impact: 500 users, 45 support tickets
Cost: ~$12,500 (direct + indirect)
Root Cause: Validation middleware broken
Prevention: Integration tests missing
Recommendations:
- Add error code validation tests
- Improve staging coverage
- Add monitoring for error spikes
Lessons Learned:
- Fast response (8h MTTF)
- Staging gap allowed escape
- Need integration test gate
Full postmortem: .aiwg/testing/regression-postmortem-REG-002.md"
```
### Quarterly Analysis
```
User: "Quarterly regression analysis"
Skill generates:
"Quarterly Regression Analysis (Q1 2026)
Executive Summary:
- 58 total regressions (-35% vs Q4 2025)
- 3 production escapes (-70% vs Q4)
- MTTD: 9.2h (↓ from 24h)
- MTTF: 22h (↓ from 42h)
Strategic Insights:
- Automation investment paying off (76% faster detection)
- Integration testing gap remains (40% of regressions)
- Auth and API modules highest risk
Investment Recommendations:
- $50k: Integration test automation
- $30k: Performance testing platform
- $20k: Staging environment expansion
Full analysis: .aiwg/testing/regression-analysis-Q1-2026.md"
```
## Integration
This skill uses:
- `regression-bisect`: Import root cause analysis
- `regression-baseline`: Import drift data
- `regression-metrics`: Import statistical trends
- `project-awareness`: Detect sprints/releases
- `traceability-check`: Link to requirements
## Agent Orchestration
```yaml
agents:
synthesis:
agent: technical-writer
focus: Report generation and clarity
analysis:
agent: metrics-analyst
focus: Data correlation and insights
recommendations:
agent: test-architect
focus: Prevention strategies
```
## Configuration
### Report Templates
```yaml
report_templates:
incident:
template: templates/regression/incident-report.md
sections: [summary, timeline, root_cause, impact, prevention]
sprint:
template: templates/regression/sprint-report.md
sections: [summary, inventory, metrics, recommendations]
quarterly:
template: templates/regression/quarterly-report.md
sections: [executive, trends, strategic, investments]
```
### Data Aggregation
```yaml
aggregation_config:
sources:
- bisect_reports: .aiwg/testing/regression-bisect-*/
- baselines: .aiwg/testing/baseline-comparisons/
- metrics: .aiwg/testing/regression-metrics-dashboard.md
- issues: github_issues
- incidents: .aiwg/incidents/
correlation_rules:
- link_bisect_to_issue
- map_component_to_owner
- calculate_business_impact
```
## Output Locations
- Reports: `.aiwg/testing/regression-report-{period}.md`
- Postmortems: `.aiwg/testing/regression-postmortem-{issue}.md`
- Quarterly: `.aiwg/testing/regression-analysis-Q{n}-{year}.md`
## References
- @$AIWG_ROOT/agentic/code/frameworks/sdlc-complete/schemas/test/regression-report-schema.yaml
- @$AIWG_ROOT/agentic/code/frameworks/sdlc-complete/templates/regression/incident-report.md
- @$AIWG_ROOT/agentic/code/frameworks/sdlc-complete/agents/technical-writer.md
- @$AIWG_ROOT/agentic/code/frameworks/sdlc-complete/agents/metrics-analyst.mdRelated Skills
uat-report
Generate UAT completion report with tool coverage matrix, pass/fail metrics, and regression detection
sdlc-reports
Generate SDLC reports including iteration status, metrics dashboards, and executive summaries across phases
regression-visual
Detect visual and UI regressions through screenshot comparison and pixel-diff analysis across browsers and viewports
regression-performance
Detect performance regressions by comparing benchmarks across versions with latency, throughput, and statistical significance analysis
regression-metrics
Track and analyze regression statistics, trends, hotspots, and health indicators across test suites
regression-learning
Improve regression detection over time through cross-task pattern recognition, test prioritization, and historical analysis
regression-cicd-hooks
Integrate regression testing into CI/CD pipelines with baseline comparison and merge blocking on failure
regression-check
Compare current behavior against baseline to detect regressions
regression-bisect
Identify the commit that introduced a regression using git bisect with automated test execution and blame context
regression-baseline
Create and maintain regression test baselines for comparison and drift detection across versioned snapshots
provenance-report
Generate provenance coverage dashboard and statistics
mention-report
Generate traceability report from @-mentions