regression-learning
Improve regression detection over time through cross-task pattern recognition, test prioritization, and historical analysis
Best use case
regression-learning is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
It is a strong fit for teams already working in Codex.
Improve regression detection over time through cross-task pattern recognition, test prioritization, and historical analysis
Teams using regression-learning should expect a more consistent output, faster repeated execution, less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in
.claude/skills/regression-learning/SKILL.mdinside your project - Restart your AI agent — it will auto-discover the skill
How regression-learning Compares
| Feature / Agent | regression-learning | Standard Approach |
|---|---|---|
| Platform Support | Codex | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Improve regression detection over time through cross-task pattern recognition, test prioritization, and historical analysis
Which AI agents support this skill?
This skill is designed for Codex.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
Related Guides
AI Agents for Coding
Browse AI agent skills for coding, debugging, testing, refactoring, code review, and developer workflows across Claude, Cursor, and Codex.
Cursor vs Codex for AI Workflows
Compare Cursor and Codex for AI coding workflows, repository assistance, debugging, refactoring, and reusable developer skills.
AI Agent for Product Research
Browse AI agent skills for product research, competitive analysis, customer discovery, and structured product decision support.
SKILL.md Source
# regression-learning
Cross-task learning skill for improving regression detection over time through pattern recognition, test prioritization, and historical analysis.
## Triggers
Alternate expressions and non-obvious activations (primary phrases are matched automatically from the skill description):
- "learn from regressions" → adaptive pattern recognition
- "improve detection accuracy" → regression ML tuning
## Purpose
This skill integrates with Al's cross-task learning system to continuously improve regression detection by:
- Learning from past regression patterns (REG-XXXX records)
- Identifying high-risk code patterns and hot spots
- Prioritizing tests by regression likelihood
- Building a taxonomy of root causes
- Suggesting fixes based on past resolutions
- Adapting test strategies based on effectiveness
Based on REF-013 MetaGPT's executable feedback pattern, this skill creates a persistent memory that accumulates knowledge across iterations and projects.
## Behavior
When triggered, this skill:
1. **Pattern Recognition**:
- Analyze historical regression records
- Identify recurring error patterns
- Map code patterns to failure types
- Detect hot spots in codebase
- Build regression taxonomy
2. **Test Prioritization**:
- Rank tests by regression detection likelihood
- Prioritize based on:
- Historical failure correlation
- Code change patterns
- Component risk scores
- Recent regression trends
- Generate optimal test execution order
3. **Root Cause Memory**:
- Store root cause analyses from REG-XXXX
- Link similar regressions across time
- Build fix template library
- Track fix effectiveness
- Enable cross-project learning
4. **Predictive Analysis**:
- Predict regression probability for code changes
- Flag high-risk modifications
- Suggest additional test coverage
- Recommend preventive actions
5. **Feedback Loop**:
- Learn from false positives/negatives
- Adjust confidence thresholds
- Incorporate human corrections
- Track prediction accuracy
- Improve over time
## Learning Schema
### Learning Record
```yaml
# .aiwg/ralph/learning/regression-patterns.yaml
learning_records:
- pattern_id: "RP-001"
pattern_name: "null_access_without_check"
category: null_undefined_access
occurrences:
- regression_id: "REG-0023"
file: "src/auth/validate.ts"
root_cause: "Missing null check before property access"
fix: "Added early return for null/undefined"
effectiveness: 1.0 # 100% success rate
- regression_id: "REG-0047"
file: "src/api/users.ts"
root_cause: "Null user object accessed"
fix: "Added null guard clause"
effectiveness: 1.0
- regression_id: "REG-0091"
file: "src/payments/process.ts"
root_cause: "Transaction object null"
fix: "Added null check with error throw"
effectiveness: 1.0
statistics:
total_occurrences: 3
avg_fix_time_hours: 1.5
recurrence_rate: 0.0 # 0% recurrence after fix
detection_methods:
automated_test: 3
production: 0
fix_template:
pattern: |
// Before: obj.property
// After:
if (!obj) {
throw new Error('Object is null or undefined');
}
// Or: if (!obj) return defaultValue;
applicability_rules:
- language: typescript
- pattern: "Cannot read property .* of null"
- context: function_entry
confidence: 0.95
last_updated: "2026-01-28T15:30:00Z"
```
### Hot Spot Tracking
```yaml
# .aiwg/ralph/learning/code-hotspots.yaml
hotspots:
- file: "src/auth/validate.ts"
risk_score: 8.5 # 0-10 scale
regression_history:
- REG-0023: null_access_without_check
- REG-0056: type_mismatch
- REG-0112: missing_validation
total_regressions: 3
regressions_per_kloc: 2.1 # Per 1000 lines of code
recent_changes:
last_30_days: 8
last_7_days: 2
contributors: 3
risk_factors:
high_complexity: true
frequent_changes: true
multiple_regressions: true
critical_path: true
recommended_actions:
- "Increase test coverage (current: 65%, target: 80%)"
- "Add null checks at function entry"
- "Refactor complex validation logic"
- "Add integration tests"
test_priority: critical # Run tests for this file first
```
### Test Effectiveness Tracking
```yaml
# .aiwg/ralph/learning/test-effectiveness.yaml
test_effectiveness:
- test_path: "test/auth/validate.test.ts"
regression_detection_rate: 0.85 # 85% of regressions caught
detected_regressions:
- REG-0023
- REG-0056
- REG-0112
missed_regressions:
- REG-0089: "Escaped to production"
false_positive_rate: 0.02 # 2% false alarms
execution_stats:
avg_duration_ms: 450
success_rate: 0.98
flakiness_score: 0.01 # Very stable
priority_score: 9.2 # High value test
recommendations:
- "Add edge case: empty string input"
- "Test concurrent validation requests"
```
## Regression Pattern Taxonomy
### Pattern Categories
| Category | Description | Detection | Fix Template | Effectiveness |
|----------|-------------|-----------|--------------|---------------|
| **null_undefined_access** | Accessing properties of null/undefined | `Cannot read property .* of null` | Add null check | 95% |
| **type_mismatch** | Type coercion or wrong type | `Expected .* but got .*` | Add type validation | 90% |
| **off_by_one** | Array bounds or loop conditions | `Index out of bounds` | Fix boundary conditions | 88% |
| **race_condition** | Timing-dependent failures | `Intermittent failures` | Add synchronization | 75% |
| **missing_validation** | Inputs not validated | `Validation error` | Add input checks | 92% |
| **logic_error** | Incorrect business logic | `Wrong behavior` | Fix logic | 85% |
| **resource_leak** | Resources not released | `Memory leak / handles` | Add cleanup | 80% |
| **api_misuse** | Incorrect API usage | `API error` | Fix API call | 90% |
| **configuration_error** | Wrong config values | `Config mismatch` | Update config | 95% |
## Test Prioritization Algorithm
```yaml
test_prioritization:
algorithm: weighted_risk_score
factors:
code_change_correlation: 0.30 # How often this test catches changes
historical_regression_detection: 0.25 # Past regression detection rate
code_hotspot_coverage: 0.20 # Tests high-risk code
recent_failure_trend: 0.15 # Recently failing pattern
execution_efficiency: 0.10 # Fast tests preferred
output:
- test_priority: critical | high | medium | low
- execution_order: [test1, test2, ...]
- skip_recommendations: [low-value tests]
- expand_coverage: [high-risk areas]
```
### Example Prioritization
```yaml
# After code change to src/auth/validate.ts
prioritized_tests:
critical: # Run first
- test/auth/validate.test.ts
- test/integration/auth-flow.test.ts
reason: "High regression detection rate for this file"
high: # Run next
- test/api/login.test.ts
- test/api/register.test.ts
reason: "Depends on auth module"
medium: # Run if time allows
- test/unit/utils.test.ts
reason: "Low correlation with auth changes"
low: # Skip in fast-feedback mode
- test/ui/theme.test.ts
reason: "No historical correlation"
skip_in_fast_mode:
- test/e2e/full-flow.test.ts
reason: "Slow (5 min), low failure rate for auth changes"
```
## Predictive Regression Analysis
### Risk Prediction
```yaml
# On code change: src/payments/process.ts
regression_prediction:
file: "src/payments/process.ts"
change_type: modification
lines_changed: 45
risk_analysis:
overall_risk: HIGH # critical | high | medium | low
confidence: 0.82
risk_factors:
- factor: "Known hotspot (REG-0034, REG-0067)"
weight: 0.35
- factor: "Large change (45 lines)"
weight: 0.25
- factor: "Critical payment logic"
weight: 0.20
- factor: "Recent regression (REG-0103, 5 days ago)"
weight: 0.20
similar_past_changes:
- commit: "abc123"
changes: 38 lines
result: "Regression REG-0067"
time_to_detect: 2 hours
- commit: "def456"
changes: 52 lines
result: "Clean - no regressions"
recommendations:
- "Run full payment test suite (priority: critical)"
- "Add integration tests for new code paths"
- "Manual QA recommended before deploy"
- "Consider pair review with payments expert"
- "Check for null handling (pattern RP-001)"
suggested_tests:
- test/payments/process.test.ts
- test/integration/payment-flow.test.ts
- test/regression/REG-0067-refund-handling.test.ts
```
## Integration with Al
### Debug Memory Integration
```yaml
# Al uses this skill to improve across iterations
ralph_integration:
on_test_failure:
- query_similar_past_failures
- retrieve_fix_templates
- apply_learned_patterns
- update_pattern_effectiveness
on_test_success:
- record_successful_fix
- increment_fix_effectiveness
- update_confidence_scores
on_regression_detected:
- create_regression_record: REG-XXXX
- analyze_root_cause
- store_in_learning_memory
- update_hotspot_scores
- adjust_test_priorities
cross_iteration_learning:
enabled: true
memory_path: .aiwg/ralph/learning/
retention_days: 365
share_across_projects: true # Learn from all projects
```
### Learning Storage Structure
```
.aiwg/ralph/learning/
├── regression-patterns.yaml # Pattern taxonomy
├── code-hotspots.yaml # Risk scoring
├── test-effectiveness.yaml # Test value metrics
├── fix-templates/ # Reusable fix patterns
│ ├── null-check.template
│ ├── type-validation.template
│ └── error-handling.template
├── prediction-models/ # ML models (if applicable)
│ └── risk-predictor.json
└── analytics/
├── pattern-trends.json # Pattern frequency over time
└── effectiveness-metrics.json # Fix success rates
```
## Usage Examples
### Analyze Regression Patterns
```
User: "Analyze regression patterns"
Skill executes:
1. Load all REG-XXXX records from .aiwg/testing/regressions/
2. Group by root cause category
3. Identify recurring patterns
4. Build pattern taxonomy
5. Generate recommendations
Output:
"Regression Pattern Analysis
Total Regressions: 47 (last 90 days)
Top Patterns:
1. null_undefined_access (12 occurrences, 25.5%)
- Files: validate.ts, users.ts, process.ts
- Fix template success rate: 95%
- Avg time to fix: 1.5 hours
2. missing_validation (8 occurrences, 17.0%)
- Files: api/*.ts
- Fix template success rate: 92%
- Avg time to fix: 2.0 hours
3. type_mismatch (6 occurrences, 12.8%)
- Files: converters/*.ts
- Fix template success rate: 90%
- Avg time to fix: 1.0 hours
Code Hotspots:
- src/auth/validate.ts: 8.5 risk score (3 regressions)
- src/api/users.ts: 7.2 risk score (2 regressions)
- src/payments/process.ts: 8.0 risk score (2 regressions)
Recommendations:
- Increase test coverage in auth module (current: 65%)
- Add null checks at function boundaries
- Implement stricter TypeScript settings
- Refactor validate.ts (high complexity)
See full report: .aiwg/ralph/learning/pattern-analysis.md"
```
### Prioritize Tests for Change
```
User: "What tests should I run after changing src/auth/validate.ts?"
Skill analyzes:
"Test Prioritization for src/auth/validate.ts
Based on learning history:
Critical Priority (run first):
✓ test/auth/validate.test.ts
- Regression detection rate: 85%
- Detected: REG-0023, REG-0056, REG-0112
- Duration: 450ms
✓ test/integration/auth-flow.test.ts
- Regression detection rate: 75%
- Detected: REG-0089
- Duration: 2.1s
High Priority:
- test/api/login.test.ts
- test/api/register.test.ts
Recommended Additions:
- Add test for null input handling (pattern RP-001)
- Add test for concurrent requests (gap identified)
Skip in Fast Mode:
- test/ui/theme.test.ts (no correlation)
- test/e2e/full-flow.test.ts (5 min, low value for this change)
Estimated total time:
- Critical tests: 2.6s
- + High priority: 5.2s
- Full suite: 45s"
```
### Predict Regression Risk
```
User: "Predict regression risk for my changes"
Skill analyzes current diff:
"Regression Risk Prediction
Files Changed:
- src/payments/process.ts (+45 lines)
- src/payments/refund.ts (+12 lines)
Risk Assessment: HIGH (confidence: 82%)
Risk Factors:
1. Known hotspot (8.0 risk score)
- Previous: REG-0034, REG-0067, REG-0103
2. Large change (45 lines in critical module)
3. Recent regression (REG-0103, 5 days ago)
4. Pattern match: Similar to commit abc123 → REG-0067
Predicted Failure Points:
- Null handling in refund flow (pattern RP-001)
- Type validation for amounts (pattern RP-005)
- Error handling for API timeouts (pattern RP-012)
Recommendations:
✓ Run full payment test suite (20 tests, 8.5s)
✓ Add integration test for refund edge cases
✓ Manual QA for payment flows
✓ Consider pair review with @payments-expert
Apply fix templates:
- null-check.template for refund object
- error-handling.template for API calls
Proceed with caution. Consider creating feature branch
for additional testing."
```
### Find Similar Past Failures
```
User: "Similar past failures for this error?"
Current error: "TypeError: Cannot read property 'amount' of undefined"
Skill searches learning memory:
"Similar Past Failures Found
Pattern: null_undefined_access (RP-001)
Occurrences: 12 total
Most Similar:
1. REG-0023: "Cannot read property 'length' of null"
- File: src/auth/validate.ts
- Fix: Added null check at function entry
- Time to fix: 1.2 hours
- Code:
```typescript
if (!input) {
throw new Error('Input is null or undefined');
}
```
2. REG-0067: "Cannot read property 'total' of undefined"
- File: src/payments/process.ts
- Fix: Added null guard with default value
- Time to fix: 0.8 hours
- Code:
```typescript
const total = transaction?.total ?? 0;
```
3. REG-0091: "Cannot read property 'id' of null"
- File: src/api/users.ts
- Fix: Early return for null case
- Time to fix: 0.5 hours
Fix Template (95% success rate):
```typescript
// Pattern: obj.property causing error
// Fix: Add null/undefined check
if (!obj) {
throw new Error('Object is required');
}
// OR
if (!obj) return defaultValue;
// OR
const value = obj?.property ?? defaultValue;
```
Apply this fix? [y/N]"
```
## Cross-Project Learning
### Shared Learning Pool
```yaml
# Enable learning across multiple projects
cross_project_learning:
enabled: true
shared_memory:
location: ~/.aiwg/global-learning/
shared_patterns:
- null_undefined_access
- type_mismatch
- missing_validation
- off_by_one
- race_condition
project_contributions:
- project: "ai-writing-guide"
patterns_contributed: 12
patterns_consumed: 8
- project: "payment-gateway"
patterns_contributed: 15
patterns_consumed: 12
privacy:
anonymize_file_paths: true
anonymize_business_logic: true
share_only_patterns: true # Not sensitive code
```
## Agent Orchestration
```yaml
agents:
pattern_analysis:
agent: test-architect
focus: Identify and categorize regression patterns
risk_prediction:
agent: reliability-engineer
focus: Predict regression likelihood
test_prioritization:
agent: test-engineer
focus: Optimize test execution order
fix_suggestion:
agent: software-implementer
focus: Apply learned fix templates
```
## Configuration
```yaml
# .aiwg/config/regression-learning.yaml
regression_learning:
enabled: true
learning_sources:
- .aiwg/testing/regressions/ # REG-XXXX records
- .aiwg/ralph/debug-memory/ # Execution history
- .aiwg/ralph/learning/ # Accumulated patterns
pattern_detection:
min_occurrences: 2 # Minimum to consider a pattern
confidence_threshold: 0.70
test_prioritization:
enabled: true
algorithm: weighted_risk_score
risk_prediction:
enabled: true
confidence_threshold: 0.75
cross_project_learning:
enabled: false # Privacy consideration
retention:
regression_records: 365 days
pattern_library: forever
effectiveness_metrics: 180 days
```
## Metrics
Track learning effectiveness:
| Metric | Target | Purpose |
|--------|--------|---------|
| pattern_accuracy | >85% | Pattern matching correct |
| fix_template_success_rate | >90% | Fixes work first time |
| prediction_accuracy | >75% | Risk predictions correct |
| test_prioritization_efficiency | >50% | Time saved by prioritization |
| false_positive_rate | <10% | Spurious alerts |
| cross_project_reuse | >30% | Patterns used across projects |
## Output Locations
- Learning memory: `.aiwg/ralph/learning/`
- Pattern analysis reports: `.aiwg/ralph/learning/analytics/`
- Fix templates: `.aiwg/ralph/learning/fix-templates/`
- Global learning: `~/.aiwg/global-learning/` (if enabled)
## References
- @$AIWG_ROOT/agentic/code/addons/ralph/schemas/debug-memory.yaml - Debug memory structure
- @$AIWG_ROOT/agentic/code/frameworks/sdlc-complete/schemas/testing/regression.yaml - Regression schema
- @$AIWG_ROOT/agentic/code/frameworks/sdlc-complete/rules/executable-feedback.md - Feedback loop patterns
- @.aiwg/research/findings/REF-013-metagpt.md - MetaGPT executable feedback
- @.aiwg/research/findings/REF-015-self-refine.md - Self-Refine learning patterns
- @$AIWG_ROOT/agentic/code/frameworks/sdlc-complete/skills/regression-baseline/SKILL.md - Baseline management
- @$AIWG_ROOT/agentic/code/frameworks/sdlc-complete/skills/regression-bisect/SKILL.md - Root cause analysis
- @$AIWG_ROOT/agentic/code/frameworks/sdlc-complete/skills/regression-metrics/SKILL.md - Metrics trackingRelated Skills
regression-visual
Detect visual and UI regressions through screenshot comparison and pixel-diff analysis across browsers and viewports
regression-report
Generate comprehensive regression analysis reports combining bisect, baseline, and metrics data with actionable recommendations
regression-performance
Detect performance regressions by comparing benchmarks across versions with latency, throughput, and statistical significance analysis
regression-metrics
Track and analyze regression statistics, trends, hotspots, and health indicators across test suites
regression-cicd-hooks
Integrate regression testing into CI/CD pipelines with baseline comparison and merge blocking on failure
regression-check
Compare current behavior against baseline to detect regressions
regression-bisect
Identify the commit that introduced a regression using git bisect with automated test execution and blame context
regression-baseline
Create and maintain regression test baselines for comparison and drift detection across versioned snapshots
aiwg-orchestrate
Route structured artifact work to AIWG workflows via MCP with zero parent context cost
venv-manager
Create, manage, and validate Python virtual environments. Use for project isolation and dependency management.
pytest-runner
Execute Python tests with pytest, supporting fixtures, markers, coverage, and parallel execution. Use for Python test automation.
vitest-runner
Execute JavaScript/TypeScript tests with Vitest, supporting coverage, watch mode, and parallel execution. Use for JS/TS test automation.