code-review-quality

Conduct context-driven code reviews focusing on quality, testability, and maintainability. Use when reviewing code, providing feedback, or establishing review practices.

Best use case

code-review-quality is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using code-review-quality should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

curl -o ~/.claude/skills/code-review-quality/SKILL.md --create-dirs "https://raw.githubusercontent.com/proffesor-for-testing/agentic-qe/main/.claude/skills/code-review-quality/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/code-review-quality/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How code-review-quality Compares

| Feature / Agent | code-review-quality | Standard Approach |
|-----------------|---------------------|-------------------|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |

Frequently Asked Questions

What does this skill do?

Conduct context-driven code reviews focusing on quality, testability, and maintainability. Use when reviewing code, providing feedback, or establishing review practices.

Where can I find the source code?

You can find the source code in the proffesor-for-testing/agentic-qe repository on GitHub; the installation command above points at the raw SKILL.md file.

SKILL.md Source

# Code Review Quality

<default_to_action>
When reviewing code or establishing review practices:
1. PRIORITIZE feedback: 🔴 Blocker (must fix) → 🟡 Major → 🟢 Minor → 💡 Suggestion
2. FOCUS on: Bugs, security, testability, maintainability (not style preferences)
3. ASK questions over commands: "Have you considered...?" > "Change this to..."
4. PROVIDE context: Why this matters, not just what to change
5. LIMIT scope: Review < 400 lines at a time for effectiveness

**Quick Review Checklist:**
- Logic: Does it work correctly? Edge cases handled?
- Security: Input validation? Auth checks? Injection risks?
- Testability: Can this be tested? Is it tested?
- Maintainability: Clear naming? Single responsibility? DRY?
- Performance: O(n²) loops? N+1 queries? Memory leaks?

**Critical Success Factors:**
- Review the code, not the person
- Catching bugs > nitpicking style
- Fast feedback (< 24h) > thorough feedback
</default_to_action>
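
For instance, the N+1 pattern from the performance check above, as a minimal sketch (the `db` helper is hypothetical):

```typescript
declare const db: { query: (sql: string, params?: unknown[]) => Promise<unknown[]> };

// N+1: one round trip per user, so 100 users means 101 queries.
async function getOrdersNPlusOne(userIds: number[]): Promise<unknown[][]> {
  const orders: unknown[][] = [];
  for (const id of userIds) {
    orders.push(await db.query('SELECT * FROM orders WHERE user_id = ?', [id]));
  }
  return orders;
}

// Batched: a single query regardless of how many users there are.
async function getOrdersBatched(userIds: number[]): Promise<unknown[]> {
  return db.query('SELECT * FROM orders WHERE user_id IN (?)', [userIds]);
}
```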

## Quick Reference Card

### When to Use
- PR code reviews
- Pair programming feedback
- Establishing team review standards
- Mentoring developers

### Feedback Priority Levels
| Level | Icon | Meaning | Action |
|-------|------|---------|--------|
| Blocker | 🔴 | Bug/security/crash | Must fix before merge |
| Major | 🟡 | Logic issue/test gap | Should fix before merge |
| Minor | 🟢 | Style/naming | Nice to fix |
| Suggestion | 💡 | Alternative approach | Consider for future |

### Review Scope Limits
| Lines Changed | Recommendation |
|---------------|----------------|
| < 200 | Single review session |
| 200-400 | Review in chunks |
| > 400 | Request PR split |

### What to Focus On
| ✅ Review | ❌ Skip |
|-----------|---------|
| Logic correctness | Formatting (use linter) |
| Security risks | Naming preferences |
| Test coverage | Architecture debates |
| Performance issues | Style opinions |
| Error handling | Trivial changes |

---

## Feedback Templates

### Blocker (Must Fix)
````markdown
🔴 **BLOCKER: SQL Injection Risk**

This query is vulnerable to SQL injection:
```javascript
db.query(`SELECT * FROM users WHERE id = ${userId}`)
```

**Fix:** Use parameterized queries:
```javascript
db.query('SELECT * FROM users WHERE id = ?', [userId])
```

**Why:** User input directly in SQL allows attackers to execute arbitrary queries.
````

### Major (Should Fix)
````markdown
🟡 **MAJOR: Missing Error Handling**

What happens if `fetchUser()` throws? The error bubbles up unhandled.

**Suggestion:** Add try/catch with appropriate error response:
```javascript
try {
  const user = await fetchUser(id);
  return user;
} catch (error) {
  logger.error('Failed to fetch user', { id, error });
  throw new NotFoundError('User not found');
}
```
````

### Minor (Nice to Fix)
```markdown
🟢 **MINOR:** Variable name could be clearer

`d` doesn't convey meaning. Consider `daysSinceLastLogin`.
```

### Suggestion (Consider)
```markdown
💡 **SUGGESTION:** Consider extracting this to a helper

This validation logic appears in 3 places. A `validateEmail()` helper would reduce duplication. Not blocking, but might be worth a follow-up PR.
```

---

## Review Questions to Ask

### Logic
- What happens when X is null/empty/negative?
- Is there a race condition here?
- What if the API call fails?
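
A minimal sketch of what the first question catches: a hypothetical `averageScore` helper that silently returns `NaN` on empty input.

```typescript
// Buggy: reduce over an empty array yields 0 / 0, which is NaN.
function averageScore(scores: number[]): number {
  return scores.reduce((sum, s) => sum + s, 0) / scores.length;
}

// Edge-case-aware version a review should push toward.
function averageScoreSafe(scores: number[]): number {
  if (scores.length === 0) return 0; // or throw, depending on the contract
  return scores.reduce((sum, s) => sum + s, 0) / scores.length;
}
```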

### Security
- Is user input validated/sanitized?
- Are auth checks in place?
- Any secrets or PII exposed?
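
For the secrets/PII question, a minimal sketch (the handler and its types are hypothetical):

```typescript
type LoginAttempt = { email: string; password: string };

// 🔴 Writes the raw password and email into log storage.
function logFailedLogin(attempt: LoginAttempt): void {
  console.error('Login failed', attempt);
}

// Redacted: keep only what is needed to debug the failure.
function logFailedLoginRedacted(attempt: LoginAttempt): void {
  console.error('Login failed', { emailDomain: attempt.email.split('@')[1] });
}
```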

### Testability
- How would you test this?
- Are dependencies injectable?
- Is there a test for the happy path? Edge cases?
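
The injectability question in practice, as a hedged sketch (the session shape and endpoint are assumptions):

```typescript
// Hard to unit test: hard-wired to a global fetch and the real clock.
async function isSessionExpired(sessionId: string): Promise<boolean> {
  const res = await fetch(`/api/sessions/${sessionId}`);
  const session: { expiresAt: number } = await res.json();
  return session.expiresAt < Date.now();
}

// Injectable: both dependencies can be faked in a test.
async function isSessionExpiredDI(
  sessionId: string,
  loadSession: (id: string) => Promise<{ expiresAt: number }>,
  now: () => number = Date.now
): Promise<boolean> {
  const session = await loadSession(sessionId);
  return session.expiresAt < now();
}
```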

### Maintainability
- Will the next developer understand this?
- Is this doing too many things?
- Is there duplication we could reduce?

## Minimum Findings Enforcement
Reviews must meet a minimum weighted finding score of 3.0 (CRITICAL=3, HIGH=2, MEDIUM=1, LOW=0.5, INFORMATIONAL=0.25). If the initial review falls short, run the qe-devils-advocate agent as a meta-reviewer to find additional observations. Every review should have at least 3 actionable observations.
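
One way to encode that gate, as a sketch (the function and type names are illustrative, not part of any agent API):

```typescript
type Severity = 'CRITICAL' | 'HIGH' | 'MEDIUM' | 'LOW' | 'INFORMATIONAL';

const WEIGHTS: Record<Severity, number> = {
  CRITICAL: 3, HIGH: 2, MEDIUM: 1, LOW: 0.5, INFORMATIONAL: 0.25,
};

// Pass when the weighted score reaches the threshold and the review
// carries at least 3 actionable observations; otherwise escalate to
// the qe-devils-advocate meta-review.
function reviewMeetsMinimum(findings: Severity[], threshold = 3.0): boolean {
  const score = findings.reduce((sum, sev) => sum + WEIGHTS[sev], 0);
  return score >= threshold && findings.length >= 3;
}

reviewMeetsMinimum(['HIGH', 'MEDIUM', 'MEDIUM']); // true: 2 + 1 + 1 = 4.0
```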

---

## Agent-Assisted Reviews

```typescript
// Comprehensive code review
await Task("Code Review", {
  prNumber: 123,
  checks: ['security', 'performance', 'testability', 'maintainability'],
  feedbackLevels: ['blocker', 'major', 'minor'],
  autoApprove: { maxBlockers: 0, maxMajor: 2 }
}, "qe-quality-analyzer");

// Security-focused review
await Task("Security Review", {
  prFiles: changedFiles,
  scanTypes: ['injection', 'auth', 'secrets', 'dependencies']
}, "qe-security-scanner");

// Test coverage review
await Task("Coverage Review", {
  prNumber: 123,
  requireNewTests: true,
  minCoverageDelta: 0
}, "qe-coverage-analyzer");
```

---

## Agent Coordination Hints

### Memory Namespace
```
aqe/code-review/
├── review-history/*     - Past review decisions
├── patterns/*           - Common issues by team/repo
├── feedback-templates/* - Reusable feedback
└── metrics/*            - Review turnaround time
```
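
A hypothetical sketch of an agent using this namespace; `memory.store` and `memory.retrieve` are illustrative names, not a documented API:

```typescript
// Record a recurring issue so future reviews can reuse the feedback template.
await memory.store('aqe/code-review/patterns/missing-error-handling', {
  occurrences: 12,
  template: 'feedback-templates/major-error-handling',
});

// Pull prior decisions before re-reviewing the same PR.
const history = await memory.retrieve('aqe/code-review/review-history/pr-123');
```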

### Fleet Coordination
```typescript
const reviewFleet = await FleetManager.coordinate({
  strategy: 'code-review',
  agents: [
    'qe-quality-analyzer',    // Logic, maintainability
    'qe-security-scanner',    // Security risks
    'qe-performance-tester',  // Performance issues
    'qe-coverage-analyzer'    // Test coverage
  ],
  topology: 'parallel'
});
```

---

## Review Etiquette

| ✅ Do | ❌ Don't |
|-------|---------|
| "Have you considered...?" | "This is wrong" |
| Explain why it matters | Just say "fix this" |
| Acknowledge good code | Only point out negatives |
| Suggest, don't demand | Be condescending |
| Review < 400 lines | Review 2000 lines at once |

---

## Related Skills
- [agentic-quality-engineering](../agentic-quality-engineering/) - Agent coordination
- [security-testing](../security-testing/) - Security review depth
- [refactoring-patterns](../refactoring-patterns/) - Maintainability patterns

---

## Remember

**Prioritize feedback:** 🔴 Blocker → 🟡 Major → 🟢 Minor → 💡 Suggestion. Focus on bugs and security, not style. Ask questions, don't command. Review < 400 lines at a time. Fast feedback (< 24h) beats thorough feedback.

**With Agents:** Agents automate security, performance, and coverage checks, freeing human reviewers to focus on logic and design. Use agents for consistent, fast initial review.

## Skill Composition

- **Security concerns** → Compose with `/security-testing` for security-focused review
- **Coverage check** → Run `/qe-coverage-analysis` on changed files
- **Ship decision** → Feed review results into `/qe-quality-assessment`

## Gotchas

- Agent reviews >400 lines at once and misses issues — chunk reviews to 200-400 lines maximum
- Nitpicking style while missing logic bugs is the #1 agent review failure — prioritize correctness over formatting
- Agent approves code that compiles but has subtle race conditions — always check shared state and async patterns (see the sketch after this list)
- Review comments without suggested fixes are unhelpful — always include a proposed alternative
- Agent doesn't check if the PR actually solves the linked issue — verify the stated problem is actually fixed
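
A minimal sketch of the race-condition gotcha: this compiles and passes a sequential test, but two concurrent calls interleave across the `await` and lose an update (`auditLog` stands in for any async I/O):

```typescript
const auditLog = async (_amount: number): Promise<void> => { /* pretend I/O */ };

let balance = 100;

// Two concurrent withdraw(40) calls both read 100 before either writes,
// so balance ends at 60 instead of 20: one withdrawal is silently lost.
async function withdraw(amount: number): Promise<void> {
  const current = balance;    // read shared state
  await auditLog(amount);     // the await yields; another call can run here
  balance = current - amount; // write based on a possibly stale read
}

// await Promise.all([withdraw(40), withdraw(40)]);
```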

Related Skills

All of the related skills below are from proffesor-for-testing/agentic-qe.

qe-verification-quality

Comprehensive truth scoring, code quality verification, and automatic rollback system with 0.95 accuracy threshold for ensuring high-quality agent outputs and codebase reliability.

qe-sherlock-review

Evidence-based investigative code review using deductive reasoning to determine what actually happened versus what was claimed. Use when verifying implementation claims, investigating bugs, validating fixes, or conducting root cause analysis. Elementary approach to finding truth through systematic observation.

qe-quality-metrics

Measure quality effectively with actionable metrics. Use when establishing quality dashboards, defining KPIs, or evaluating test effectiveness.

qe-pr-review

Scope-aware GitHub PR review with a user-friendly tone and trust tier validation.

qe-github-code-review

Comprehensive GitHub code review with AI-powered swarm coordination.

qe-code-review-quality

Conduct context-driven code reviews focusing on quality, testability, and maintainability. Use when reviewing code, providing feedback, or establishing review practices.

qe-brutal-honesty-review

Unvarnished technical criticism combining Linus Torvalds' precision, Gordon Ramsay's standards, and James Bach's BS-detection. Use when code/tests need harsh reality checks, certification schemes smell fishy, or technical decisions lack rigor. No sugar-coating, just surgical truth about what's broken and why.

qe-agentic-quality-engineering

AI agents as force multipliers for quality work. Core skill for all 19 QE agents using PACT principles.

verification-quality

Verifies agent outputs against expected results and validates code changes pass quality checks before merge. Use when verifying agent outputs are correct, validating code changes before merge, or configuring automatic rollback for failed quality checks.

sherlock-review

Evidence-based investigative code review using deductive reasoning to determine what actually happened versus what was claimed. Use when verifying implementation claims, investigating bugs, validating fixes, or conducting root cause analysis. Elementary approach to finding truth through systematic observation.

quality-metrics

Tracks quality metrics including defect density, test effectiveness ratio, DORA metrics, and mean time to detection. Use when establishing quality dashboards, defining KPIs, evaluating test suite effectiveness, or reporting quality trends to stakeholders.

qe-quality-assessment

Evaluates code quality through complexity analysis, lint results, code smell detection, and test health metrics. Use when assessing deployment readiness, configuring quality gates, scoring a codebase for release, or generating quality reports with pass/fail verdicts.