dev-tech-debt-review

Detect AI/agentic-specific anti-patterns that traditional linters miss. Analyzes tool/agent boundary violations, prompt debt, context window issues, testing patterns, and more. Returns scored findings with remediation guidance.

by diegosouzapw

Best use case

dev-tech-debt-review is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using dev-tech-debt-review should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/dev-tech-debt-review/SKILL.md --create-dirs "https://raw.githubusercontent.com/diegosouzapw/awesome-omni-skill/main/skills/ai-agents/dev-tech-debt-review/SKILL.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/dev-tech-debt-review/SKILL.md inside your project
Restart your AI agent — it will auto-discover the skill

SKILL.md Source

# dev.tech-debt-review

> Detect AI/agentic-specific tech debt that traditional linters miss.

## User Input

```text
$ARGUMENTS
```

You **MUST** consider the user input before proceeding (if not empty).

## Usage Modes

| Mode | Command | Scope | When to Use |
|------|---------|-------|-------------|
| **Changed Files** | `/dev.tech-debt-review` | Files changed vs main | Before PR (default) |
| **Full Audit** | `/dev.tech-debt-review --all` | Full src/ directory | Release gate, quarterly review |
| **Category Focus** | `/dev.tech-debt-review --category=boundary` | Specific category | Targeted cleanup |

**Categories**: `boundary`, `prompt`, `context`, `testing`, `error`, `state`, `observability`, `subagent`

---

## Goal

Detect AI/agentic-specific anti-patterns that traditional linters miss:

| Category | What It Finds | Why It Matters |
|----------|---------------|----------------|
| **Tool/Agent Boundary** | Tools making decisions, encoding thresholds | Violates Constitution VII |
| **Prompt Debt** | Hardcoded prompts, no versioning, injection risks | Maintenance nightmare |
| **Context Window** | Unbounded data, no truncation, raw dumps | Performance/reliability |
| **Testing** | Live LLM in unit tests, missing VCR, no evals | CI flakiness, cost |
| **Error Handling** | No retry/backoff, missing timeouts | Reliability |
| **State Management** | Unbounded conversation history | Memory leaks |
| **Observability** | No tracing, missing metrics | Debugging blind spots |
| **Subagent** | Depth > 1, unclear boundaries | Architecture violation |

Traditional linters catch syntax and style. This skill catches **semantic anti-patterns** specific to agentic systems.
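
As a rough illustration of the Tool/Agent Boundary category, the sketch below contrasts a tool that returns a judgment with one that returns only measurements. The function names, the `CoverageReport` shape, and the 0.3 threshold are hypothetical; they are chosen to mirror the JUDGMENT_IN_TOOL and THRESHOLD_ENCODING descriptions in this skill and are not taken from any real codebase.

```typescript
// Hypothetical tool input shape, for illustration only.
interface CoverageReport {
  coveredLines: number;
  totalLines: number;
}

// Anti-pattern: the tool encodes a threshold and returns a judgment.
// This would likely be flagged as JUDGMENT_IN_TOOL and THRESHOLD_ENCODING.
function rateCoverage(report: CoverageReport): "good" | "bad" {
  const ratio = report.coveredLines / report.totalLines;
  return ratio < 0.3 ? "bad" : "good";
}

// Preferred shape: the tool reports data and leaves the judgment to the agent.
function measureCoverage(report: CoverageReport): { ratio: number; uncoveredLines: number } {
  return {
    ratio: report.totalLines === 0 ? 0 : report.coveredLines / report.totalLines,
    uncoveredLines: report.totalLines - report.coveredLines,
  };
}
```

A syntax-level linter accepts both functions; the difference lies only in who makes the decision, which is why the skill relies on semantic review.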

---

## Operating Constraints

**STRICTLY READ-ONLY**: Do **not** modify any files. Output analysis and recommendations.

**SEMANTIC ANALYSIS**: Read and understand code patterns; don't just grep.

**EVIDENCE-BASED**: Every finding must cite file:line and a code snippet.

**SCORING**: Aggregate findings into a 0-100 score for PR gate decisions.

---

## Constitution Alignment

This skill validates adherence to project principles:

| Principle | What We Check |
|-----------|---------------|
| **I. Local-First** | No data transmission, security of prompts |
| **III. Causal-First** | Recommendations trace to evidence |
| **VII. Intelligent Tooling** | Tools provide data, agent provides judgment |
| **VIII. Compounding Value** | Historical tracking of tech debt |
| **IX. Agent-Aware** | Context optimized for agent cognition |

Reference: [Constitution Mapping](references/constitution-mapping.md)

---

## Execution Steps

### Phase 0: Scope Determination

**You handle this phase directly.**

**Parse user input to determine scope:**

1. **If `--all` flag present**: Full codebase audit
   ```bash
   find src -type f \( -name "*.ts" -o -name "*.tsx" \) ! -name "*.d.ts" ! -path "*/node_modules/*"
   ```

2. **If `--category=X` flag present**: Filter to that category's patterns only

3. **If specific file path provided**: Analyze that file
   ```bash
   ls -la <provided-path>
   ```

4. **Default (no args)**: Changed files only
   ```bash
   git diff --name-only main...HEAD | grep -E '\.tsx?$'
   ```

**Report mode to user:**
- `--all` mode: "Running FULL CODEBASE audit on N files"
- `--category=X`: "Running targeted audit for category: X"
- Default: "Reviewing N files changed vs main"

**Load context:**
- Read `.specify/memory/constitution.md` for principle references
- Check for previous review at `.claude/reviews/tech-debt-*.json` for trend comparison
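
The scope rules above could be sketched as a small argument parser. This is a minimal sketch, not part of the skill: the `Scope` shape and the `parseScope` name are hypothetical, and it assumes `--category` acts as a filter that may be combined with any file selection.

```typescript
// Categories listed under "Usage Modes".
const CATEGORIES = [
  "boundary", "prompt", "context", "testing",
  "error", "state", "observability", "subagent",
] as const;
type Category = (typeof CATEGORIES)[number];

// Hypothetical scope model: which files to read, plus an optional category filter.
interface Scope {
  files: { kind: "all" } | { kind: "changed" } | { kind: "path"; path: string };
  category: Category | null;
}

function isCategory(value: string): value is Category {
  return (CATEGORIES as readonly string[]).includes(value);
}

function parseScope(args: string): Scope {
  const tokens = args.trim().split(/\s+/).filter((t) => t.length > 0);

  // Optional category filter (assumed to combine with any file selection).
  let category: Category | null = null;
  const flag = tokens.find((t) => t.startsWith("--category="));
  if (flag !== undefined) {
    const value = flag.slice("--category=".length);
    if (!isCategory(value)) throw new Error(`Unknown category: ${value}`);
    category = value;
  }

  // Same precedence as the numbered list: --all, then an explicit path, then the default.
  if (tokens.includes("--all")) return { files: { kind: "all" }, category };
  const path = tokens.find((t) => !t.startsWith("--"));
  if (path !== undefined) return { files: { kind: "path", path }, category };
  return { files: { kind: "changed" }, category }; // default: files changed vs main
}
```

For example, `parseScope("--category=boundary")` selects the changed files and restricts analysis to the boundary patterns.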

---

### Phase 1: Parallel Category Analysis (Subagents)

**Invoke the 8 category-specific analyzers IN PARALLEL (single message, multiple Task calls).** When `--category=X` is set, invoke only that category's analyzer.

Each subagent focuses on one category. Use **Haiku** for single-file analysis, **Sonnet** for cross-file patterns.

```
Task(boundary-analyzer, model=haiku, "Analyze files for Tool/Agent Boundary violations.

PATTERNS TO DETECT:
- JUDGMENT_IN_TOOL: Functions that return judgments ('good', 'bad', 'low', 'high')
- ORCHESTRATION_IN_TOOL: Tools that decide when/how to call other tools
- THRESHOLD_ENCODING: Hardcoded thresholds that encode decisions (if score < 0.3)
- DECISION_RETURN: Functions returning 'suggestedAction', 'shouldDo', etc.

FILES: [list]

For each finding, return:
- pattern: The pattern code (e.g., JUDGMENT_IN_TOOL)
- file: Absolute path
- line: Line number
- snippet: The offending code (max 5 lines)
- evidence: Why this matches the pattern

Return JSON array of findings. Return empty array if no issues.")

Task(prompt-analyzer, model=haiku, "Analyze files for Prompt Debt.

PATTERNS TO DETECT:
- HARDCODED_PROMPT: Prompt strings embedded directly in code (not in prompt files)
- NO_VERSIONING: Prompts without version tracking or timestamps
- INJECTION_RISK: User input concatenated into prompts without sanitization
- PROMPT_SPRAWL: Same prompt logic duplicated in multiple places

FILES: [list]

Return JSON array of findings with pattern, file, line, snippet, evidence.")

Task(context-analyzer, model=haiku, "Analyze files for Context Window issues.

PATTERNS TO DETECT:
- UNBOUNDED_DATA: Arrays/objects passed to agent without size limits
- NO_SUMMARIZATION: Large data returned without truncation/summarization
- RAW_DUMP: Tool returning raw data structures instead of formatted summaries
- CONTEXT_BLOAT: Unnecessary verbose output in tool responses

FILES: [list]

Return JSON array of findings with pattern, file, line, snippet, evidence.")

Task(testing-analyzer, model=haiku, "Analyze files for Testing Anti-patterns.

PATTERNS TO DETECT:
- LIVE_LLM_IN_UNIT: Unit tests that make real LLM API calls
- MISSING_VCR: Integration tests without VCR recordings
- MISSING_EVAL: Behavioral scenarios without TruLens evals
- FLAKY_ASSERTION: Tests asserting on non-deterministic LLM output
- NO_MOCK: Tests calling real external services

FILES: [list]
ADR-0011 Reference: Unit=mocked, Integration=VCR, E2E/Evals=live

Return JSON array of findings with pattern, file, line, snippet, evidence.")

Task(error-analyzer, model=haiku, "Analyze files for Error Handling issues.

PATTERNS TO DETECT:
- NO_RETRY_LOGIC: LLM API calls without retry/backoff
- MISSING_TIMEOUT: API calls without timeout configuration
- SILENT_FAILURE: Errors caught but not logged/propagated
- UNHANDLED_REJECTION: Async operations without error handling
- GENERIC_CATCH: catch(e) without specific error handling

FILES: [list]

Return JSON array of findings with pattern, file, line, snippet, evidence.")

Task(state-analyzer, model=haiku, "Analyze files for State Management issues.

PATTERNS TO DETECT:
- UNBOUNDED_HISTORY: Conversation history without max length
- MEMORY_LEAK: Growing state structures without cleanup
- NO_CHECKPOINT: Long-running operations without checkpointing
- STALE_STATE: Cached state without invalidation strategy

FILES: [list]

Return JSON array of findings with pattern, file, line, snippet, evidence.")

Task(observability-analyzer, model=haiku, "Analyze files for Observability gaps.

PATTERNS TO DETECT:
- NO_TRACING: LLM calls without trace/span context
- MISSING_METRICS: No token counting, latency tracking
- NO_DEBUG_LOG: Complex operations without debug logging
- OPAQUE_ERROR: Errors without context for debugging

FILES: [list]

Return JSON array of findings with pattern, file, line, snippet, evidence.")

Task(subagent-analyzer, model=sonnet, "Analyze files for Subagent Architecture violations.

PATTERNS TO DETECT:
- DEPTH_VIOLATION: Subagents spawning sub-subagents (depth > 1)
- UNCLEAR_BOUNDARY: Subagent with multiple unrelated responsibilities
- MISSING_ISOLATION: Subagent sharing mutable state with parent
- RECURSIVE_SPAWN: Agents that can spawn themselves

FILES: [list]
Constitution C8 Reference: Subagent depth limited to 1

Return JSON array of findings with pattern, file, line, snippet, evidence.")
```

**Wait for all subagents to return.**
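
Once the subagents return, their JSON arrays need to be merged and checked against the evidence requirement. The sketch below shows one possible way to do that; the `Finding` interface follows the five fields requested in the prompts above, while `parseFindings` and `mergeFindings` are hypothetical helper names. It assumes that a finding missing any field should be dropped rather than repaired.

```typescript
// Fields each analyzer is asked to return.
interface Finding {
  pattern: string;
  file: string;
  line: number;
  snippet: string;
  evidence: string;
}

// Type guard: a finding counts only if it cites a pattern, file, line, snippet and evidence.
function isFinding(value: unknown): value is Finding {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  const line = v["line"];
  return (
    typeof v["pattern"] === "string" &&
    typeof v["file"] === "string" &&
    typeof line === "number" && Number.isInteger(line) && line > 0 &&
    typeof v["snippet"] === "string" &&
    typeof v["evidence"] === "string"
  );
}

// Parse one analyzer's raw output; anything other than a JSON array is rejected.
function parseFindings(raw: string): Finding[] {
  const parsed: unknown = JSON.parse(raw);
  if (!Array.isArray(parsed)) throw new Error("Expected a JSON array of findings");
  return parsed.filter(isFinding);
}

// Combine the outputs of all analyzers into one list.
function mergeFindings(outputs: string[]): Finding[] {
  return outputs.flatMap((raw) => parseFindings(raw));
}
```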

---

### Phase 2: Constitution Cross-Reference

**You handle this phase directly.**

For each finding from Phase 1, map to Constitution principles:

Reference: [Constitution Mapping](references/constitution-mapping.md)

| Pattern | Principle | Severity Modifier |
|---------|-----------|-------------------|
| JUDGMENT_IN_TOOL | VII | +2 (direct violation) |
| ORCHESTRATION_IN_TOOL | VII | +3 (critical) |
| INJECTION_RISK | I | +5 (security) |
| DEPTH_VIOLATION | C8 | +4 (architecture) |
| LIVE_LLM_IN_UNIT | ADR-0011 | +3 (testing strategy) |
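
The mapping table can be held as a simple lookup. The sketch below is an assumption about representation only: `PrincipleMapping` and `crossReference` are hypothetical names, and the skill does not state how the severity modifier combines with the base deduction, so the code just exposes the value.

```typescript
interface PrincipleMapping {
  principle: string;        // reference as written in the table (e.g. "VII", "ADR-0011")
  severityModifier: number; // the "+N" value from the table
}

const PRINCIPLE_MAP = new Map<string, PrincipleMapping>([
  ["JUDGMENT_IN_TOOL", { principle: "VII", severityModifier: 2 }],
  ["ORCHESTRATION_IN_TOOL", { principle: "VII", severityModifier: 3 }],
  ["INJECTION_RISK", { principle: "I", severityModifier: 5 }],
  ["DEPTH_VIOLATION", { principle: "C8", severityModifier: 4 }],
  ["LIVE_LLM_IN_UNIT", { principle: "ADR-0011", severityModifier: 3 }],
]);

// Patterns absent from the table return null rather than a guessed mapping.
function crossReference(pattern: string): PrincipleMapping | null {
  return PRINCIPLE_MAP.get(pattern) ?? null;
}
```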

---

### Phase 3: Severity Scoring

**You handle this phase directly.**

**Base deductions per finding type:**

Reference: [Severity Criteria](references/severity-criteria.md)

| Category | Pattern | Base Deduction |
|----------|---------|----------------|
| Boundary | JUDGMENT_IN_TOOL | -8 |
| Boundary | ORCHESTRATION_IN_TOOL | -10 |
| Boundary | THRESHOLD_ENCODING | -6 |
| Prompt | HARDCODED_PROMPT | -3 |
| Prompt | INJECTION_RISK | -15 |
| Prompt | PROMPT_SPRAWL | -4 |
| Context | UNBOUNDED_DATA | -7 |
| Context | RAW_DUMP | -5 |
| Testing | LIVE_LLM_IN_UNIT | -10 |
| Testing | MISSING_VCR | -6 |
| Testing | MISSING_EVAL | -8 |
| Error | NO_RETRY_LOGIC | -6 |
| Error | SILENT_FAILURE | -8 |
| State | UNBOUNDED_HISTORY | -10 |
| State | NO_CHECKPOINT | -5 |
| Observability | NO_TRACING | -5 |
| Observability | MISSING_METRICS | -4 |
| Subagent | DEPTH_VIOLATION | -12 |

**Scoring formula:**
```
score = max(0, 100 + sum(deductions))   # deductions are the negative values from the table above
```

**Score interpretation:**

| Score | Grade | Action |
|-------|-------|--------|
| 90-100 | A | Ship it |
| 80-89 | B | Minor cleanup before merge |
| 70-79 | C | Plan remediation sprint |
| 60-69 | D | Urgent fixes needed |
| < 60 | F | Block release |
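
A minimal sketch of the scoring and grading steps follows. It stores the table's deductions as positive magnitudes and subtracts them from 100. Two things are assumptions here: patterns without a listed base deduction contribute 0, and the Phase 2 severity modifiers are left out because the skill does not say how they enter the score. The names `computeScore` and `gradeFor` are hypothetical.

```typescript
type Grade = "A" | "B" | "C" | "D" | "F";

// Base deductions from the table above, stored as positive magnitudes.
const BASE_DEDUCTIONS = new Map<string, number>([
  ["JUDGMENT_IN_TOOL", 8],
  ["ORCHESTRATION_IN_TOOL", 10],
  ["THRESHOLD_ENCODING", 6],
  ["HARDCODED_PROMPT", 3],
  ["INJECTION_RISK", 15],
  ["PROMPT_SPRAWL", 4],
  ["UNBOUNDED_DATA", 7],
  ["RAW_DUMP", 5],
  ["LIVE_LLM_IN_UNIT", 10],
  ["MISSING_VCR", 6],
  ["MISSING_EVAL", 8],
  ["NO_RETRY_LOGIC", 6],
  ["SILENT_FAILURE", 8],
  ["UNBOUNDED_HISTORY", 10],
  ["NO_CHECKPOINT", 5],
  ["NO_TRACING", 5],
  ["MISSING_METRICS", 4],
  ["DEPTH_VIOLATION", 12],
]);

// One entry in `patterns` per finding; unlisted patterns deduct 0 (assumption).
function computeScore(patterns: string[]): number {
  const total = patterns.reduce((sum, p) => sum + (BASE_DEDUCTIONS.get(p) ?? 0), 0);
  return Math.max(0, 100 - total);
}

// Grade bands from the score interpretation table.
function gradeFor(score: number): Grade {
  if (score >= 90) return "A";
  if (score >= 80) return "B";
  if (score >= 70) return "C";
  if (score >= 60) return "D";
  return "F";
}
```

As a worked case, one JUDGMENT_IN_TOOL (8), one THRESHOLD_ENCODING (6), and one MISSING_EVAL (8) deduct 22 points, giving a score of 78 and grade C.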

---

### Phase 4: Generate Report

**You handle this phase directly.**

**Output format:**

```markdown
## Tech Debt Review Report

**Score**: XX/100 (Grade: X)
**Trend**: [+/-N vs last review] or [First review - no trend data]
**Files Analyzed**: N
**Scope**: [changed files | full codebase | category: X]

---

### Executive Summary

| Category | Findings | Impact | Constitution |
|----------|----------|--------|--------------|
| Boundary | N | -XX | VII |
| Testing | N | -XX | ADR-0011 |
| ... | ... | ... | ... |
| **Total** | **N** | **-XX** | |

---

### P0 - Block Release

These issues MUST be fixed before merge:

#### [PATTERN_CODE] in file.ts:123

**Code:**
~~~typescript
// The offending code snippet
~~~

**Problem**: [Why this is an anti-pattern]

**Constitution**: Violates Principle [X]

**Remediation**: [Specific fix guidance]

---

### P1 - This Sprint

[Summary findings - less detail than P0]

---

### P2 - Backlog

| Pattern | File | Line | Category |
|---------|------|------|----------|
| ... | ... | ... | ... |

---

### Historical Trend

| Date | Score | Grade | Change |
|------|-------|-------|--------|
| [today] | XX | X | - |
| [prev] | XX | X | +/-N |

---

### Anti-Pattern Reference

See: [AI Anti-Patterns Guide](references/ai-anti-patterns.md)
```
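
The Executive Summary rows can be derived by grouping scored findings per category. The following is a sketch under stated assumptions: `ScoredFinding`, `summarize`, and `renderSummaryTable` are hypothetical names, each finding is assumed to carry its deduction as a positive number, and the Constitution column is omitted for brevity.

```typescript
interface ScoredFinding {
  category: string;
  pattern: string;
  deduction: number; // positive magnitude of the base deduction
}

interface CategorySummary {
  findings: number;
  impact: number; // negative total, as shown in the report
}

// Group findings by category, counting them and summing their impact.
function summarize(findings: ScoredFinding[]): Map<string, CategorySummary> {
  const summary = new Map<string, CategorySummary>();
  for (const f of findings) {
    const entry = summary.get(f.category) ?? { findings: 0, impact: 0 };
    entry.findings += 1;
    entry.impact -= f.deduction;
    summary.set(f.category, entry);
  }
  return summary;
}

// Render the per-category rows plus a Total row as a Markdown table.
function renderSummaryTable(summary: Map<string, CategorySummary>): string {
  const rows = ["| Category | Findings | Impact |", "|----------|----------|--------|"];
  let totalFindings = 0;
  let totalImpact = 0;
  summary.forEach((s, category) => {
    rows.push(`| ${category} | ${s.findings} | ${s.impact} |`);
    totalFindings += s.findings;
    totalImpact += s.impact;
  });
  rows.push(`| **Total** | **${totalFindings}** | **${totalImpact}** |`);
  return rows.join("\n");
}
```

The same `summarize` output matches the shape of the `categories` object saved in the historical record.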

---

### Phase 5: Save Historical Data

**You handle this phase directly.**

Save review results to `.claude/reviews/tech-debt-YYYY-MM-DD.json`:

```json
{
  "version": "1.0.0",
  "timestamp": "2026-01-26T14:30:00Z",
  "scope": "changed_files|all|category:X",
  "filesAnalyzed": 145,
  "score": 78,
  "grade": "C",
  "categories": {
    "boundary": { "findings": 3, "impact": -23 },
    "testing": { "findings": 5, "impact": -30 },
    ...
  },
  "findings": [
    {
      "pattern": "JUDGMENT_IN_TOOL",
      "category": "boundary",
      "file": "src/tools/analyzer.ts",
      "line": 42,
      "snippet": "...",
      "severity": 8
    }
  ],
  "trend": {
    "previousScore": 72,
    "previousDate": "2026-01-19",
    "change": 6
  }
}
```
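
The file name and the `trend` object can be derived from the current score and the previous review, if one exists. This is a minimal sketch: `reviewFileName` and `computeTrend` are hypothetical helpers, and the date is assumed to be taken in UTC from the ISO timestamp.

```typescript
interface PreviousReview {
  timestamp: string; // ISO 8601, as in the saved JSON
  score: number;
}

interface Trend {
  previousScore: number;
  previousDate: string; // YYYY-MM-DD
  change: number;       // positive means the score improved
}

// Build the tech-debt-YYYY-MM-DD.json file name from a date (UTC).
function reviewFileName(date: Date): string {
  return `tech-debt-${date.toISOString().slice(0, 10)}.json`;
}

// Returns null for a first review, where no trend data exists.
function computeTrend(currentScore: number, previous: PreviousReview | null): Trend | null {
  if (previous === null) return null;
  return {
    previousScore: previous.score,
    previousDate: previous.timestamp.slice(0, 10),
    change: currentScore - previous.score,
  };
}
```

With the values from the example above, a current score of 78 against a previous score of 72 yields a change of 6.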

---

## Integration with Other Skills

| Skill | Relationship |
|-------|--------------|
| `/dev.test-review` | Complementary - test-review checks test quality, tech-debt-review checks testing anti-patterns |
| `/dev.verify-wiring` | Sequential - verify-wiring first, then tech-debt-review |
| `/arch-review` | Parallel - different focus areas |
| `/dev.pr` | Gate - score threshold can block PR |

**Suggested workflow:**
```
/dev.verify-wiring → /dev.tech-debt-review → /dev.test-review → /dev.pr
```

---

## Red Flags (Auto-Fail)

These patterns automatically set grade to F:

| Pattern | Why |
|---------|-----|
| INJECTION_RISK | Security vulnerability |
| DEPTH_VIOLATION | Architecture violation |
| Multiple ORCHESTRATION_IN_TOOL | Fundamental design issue |
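
The auto-fail rule could be applied as an override on top of the numeric grade. The sketch below reads "Multiple ORCHESTRATION_IN_TOOL" as two or more findings, which is an interpretation rather than something the skill states; `hasRedFlag` and `applyRedFlags` are hypothetical names.

```typescript
type Grade = "A" | "B" | "C" | "D" | "F";

// True when any red-flag condition from the table is met.
function hasRedFlag(patterns: string[]): boolean {
  const count = (name: string): number => patterns.filter((p) => p === name).length;
  return (
    count("INJECTION_RISK") > 0 ||
    count("DEPTH_VIOLATION") > 0 ||
    count("ORCHESTRATION_IN_TOOL") > 1 // "Multiple" read as two or more (assumption)
  );
}

// Red flags force grade F regardless of the score-based grade.
function applyRedFlags(grade: Grade, patterns: string[]): Grade {
  return hasRedFlag(patterns) ? "F" : grade;
}
```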

---

## References

- [AI Anti-Patterns Guide](references/ai-anti-patterns.md)
- [Constitution Mapping](references/constitution-mapping.md)
- [Severity Criteria](references/severity-criteria.md)
- [ADR-0011: Testing Strategy](../../docs/architecture/adr/0011-testing-strategy-for-agentic-components.md)
- [ADR-0019: Tool/Agent Boundary](../../docs/architecture/adr/0019-tool-agent-boundary-temporal.md)
- [Constitution](../../.specify/memory/constitution.md)

---

## Handoff

After running this skill:

- **Grade A/B**: Proceed to `/dev.pr`
- **Grade C**: Review findings, plan fixes, proceed if time-boxed
- **Grade D/F**: Fix critical issues, re-run before PR
