ln-633-test-value-auditor

Scores each test by Impact x Probability, returns KEEP/REVIEW/REMOVE decisions. Use when auditing test value and pruning low-value tests.

310 stars

bylevnikolaevich

View on GitHub Installation ↓

Best use case

ln-633-test-value-auditor is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Scores each test by Impact x Probability, returns KEEP/REVIEW/REMOVE decisions. Use when auditing test value and pruning low-value tests.

Teams using ln-633-test-value-auditor should expect a more consistent output, faster repeated execution, less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$curl -o ~/.claude/skills/ln-633-test-value-auditor/SKILL.md --create-dirs "https://raw.githubusercontent.com/levnikolaevich/claude-code-skills/main/skills-catalog/ln-633-test-value-auditor/SKILL.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/ln-633-test-value-auditor/SKILL.md inside your project
Restart your AI agent — it will auto-discover the skill

How ln-633-test-value-auditor Compares

Feature / Agent	ln-633-test-value-auditor	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

What does this skill do?

Scores each test by Impact x Probability, returns KEEP/REVIEW/REMOVE decisions. Use when auditing test value and pruning low-value tests.

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

Related Guides

AI Agents for Coding

Browse AI agent skills for coding, debugging, testing, refactoring, code review, and developer workflows across Claude, Cursor, and Codex.

Best AI Skills for Claude

Explore the best AI skills for Claude and Claude Code across coding, research, workflow automation, documentation, and agent operations.

SKILL.md Source

> **Paths:** File paths (`shared/`, `references/`, `../ln-*`) are relative to skills repo root. If not found at CWD, locate this SKILL.md directory and go up one level for repo root. If `shared/` is missing, fetch files via WebFetch from `https://raw.githubusercontent.com/levnikolaevich/claude-code-skills/master/skills/{path}`.

# Risk-Based Value Auditor (L3 Worker)

**Type:** L3 Worker

Specialized worker calculating Usefulness Score for each test.

## Purpose & Scope

- **Worker in ln-630 coordinator pipeline**
- Audit **Risk-Based Value** (Category 3: Critical Priority)
- Calculate Usefulness Score = Impact x Probability
- Make KEEP/REVIEW/REMOVE decisions
- Calculate compliance score (X/10)

## Inputs (from Coordinator)

**MANDATORY READ:** Load `shared/references/audit_worker_core_contract.md`.

Receives `contextStore` with: `tech_stack`, `testFilesMetadata`, `codebase_root`, `output_dir`.

## Workflow

**MANDATORY READ:** Load `shared/references/two_layer_detection.md` for detection methodology.

1) **Parse Context:** Extract tech stack, Impact/Probability matrices, test file list, output_dir from contextStore
2) **Calculate Scores (Layer 1):** For each test: calculate Usefulness Score = Impact x Probability
2b) **Context Analysis (Layer 2 -- MANDATORY):** Before finalizing REMOVE decisions, ask:
   - Is this a regression guard for a known past bug? -> **KEEP** regardless of Score
   - Does this test cover a critical business rule (payment, auth) even if Score<10? -> **REVIEW**, not REMOVE
   - Is this the only test covering an edge case in a critical flow? -> **KEEP**
3) **Classify Decisions:** KEEP (>=15), REVIEW (10-14), REMOVE (<10)
4) **Collect Findings:** Record each REVIEW/REMOVE decision with severity, location (file:line), effort estimate (S/M/L), recommendation
5) **Calculate Score:** Count violations by severity, calculate compliance score (X/10)
6) **Write Report:** Build full markdown report in memory per `shared/templates/audit_worker_report_template.md`, write to `{output_dir}/633-test-value.md` in single Write call
7) **Return Summary:** Return minimal summary to coordinator (see Output Format)

## Usefulness Score Calculation

### Formula

```
Usefulness Score = Business Impact (1-5) x Failure Probability (1-5)
```

### Impact Scoring (1-5)

| Score | Impact | Examples |
|-------|--------|----------|
| **5** | **Critical** | Money loss, security breach, data corruption |
| **4** | **High** | Core flow breaks (checkout, login, registration) |
| **3** | **Medium** | Feature partially broken, degraded UX |
| **2** | **Low** | Minor UX issue, cosmetic bug |
| **1** | **Trivial** | Cosmetic issue, no user impact |

### Probability Scoring (1-5)

| Score | Probability | Indicators |
|-------|-------------|------------|
| **5** | **Very High** | Complex algorithm, new technology, many dependencies |
| **4** | **High** | Multiple dependencies, concurrency, edge cases |
| **3** | **Medium** | Standard CRUD, framework defaults, established patterns |
| **2** | **Low** | Simple logic, well-established library, trivial operation |
| **1** | **Very Low** | Trivial assignment, framework-generated, impossible to break |

### Decision Thresholds

| Score Range | Decision | Action |
|-------------|----------|--------|
| **>=15** | **KEEP** | Test is valuable, maintain it |
| **10-14** | **REVIEW** | Consider if E2E already covers this |
| **<10** | **REMOVE** | Delete test, not worth maintenance cost. **Exception:** regression guards for known bugs -> KEEP. Tests covering critical business rules (payment, auth) -> REVIEW |

## Scoring Examples

### Example 1: Payment Processing Test

```
Test: "processPayment calculates discount correctly"
Impact: 5 (Critical -- money calculation)
Probability: 4 (High -- complex algorithm, multiple payment gateways)
Usefulness Score = 5 x 4 = 20
Decision: KEEP
```

### Example 2: Email Validation Test

```
Test: "validateEmail returns true for valid email"
Impact: 2 (Low -- minor UX issue if broken)
Probability: 2 (Low -- simple regex, well-tested library)
Usefulness Score = 2 x 2 = 4
Decision: REMOVE (likely already covered by E2E registration test)
```

### Example 3: Login Flow Test

```
Test: "login with valid credentials returns JWT"
Impact: 4 (High -- core flow)
Probability: 3 (Medium -- standard auth flow)
Usefulness Score = 4 x 3 = 12
Decision: REVIEW (if E2E covers, remove; else keep)
```

## Audit Rules

### 1. Calculate Score for Each Test

**Process:**
- Read test file, extract test name/description
- Analyze code under test (CUT)
- Determine Impact (1-5)
- Determine Probability (1-5)
- Calculate Usefulness Score

### 2. Classify Decisions

**KEEP (>=15):**
- High-value tests (money, security, data integrity)
- Core flows (checkout, login)
- Complex algorithms

**REVIEW (10-14):**
- Medium-value tests
- Question: "Is this already covered by E2E?"
- If yes -> REMOVE; if no -> KEEP

**REMOVE (<10):**
- Low-value tests (cosmetic, trivial)
- Framework/library tests
- Duplicates of E2E tests

### 3. Identify Patterns

**Common low-value tests (<10):**
- Testing framework behavior
- Testing trivial getters/setters
- Testing constant values
- Testing type annotations

## Scoring Algorithm

**MANDATORY READ:** Load `shared/references/audit_worker_core_contract.md` and `shared/references/audit_scoring.md`.

**Severity mapping by Usefulness Score:**
- Score <5 -> CRITICAL (test wastes significant maintenance effort)
- Score 5-9 -> HIGH (test likely wasteful)
- Score 10-14 -> MEDIUM (review needed)
- Score >=15 -> no issue (KEEP)

## Output Format

**MANDATORY READ:** Load `shared/references/audit_worker_core_contract.md` and `shared/templates/audit_worker_report_template.md`.

If summaryArtifactPath is present, write JSON summary per shared/references/audit_summary_contract.md. Compact text output is fallback only.

Write report to `{output_dir}/633-test-value.md` with `category: "Risk-Based Value"` and checks: usefulness_score, remove_candidates, review_candidates.

Return summary per `shared/references/audit_summary_contract.md`.

Legacy compact text output is allowed only when `summaryArtifactPath` is absent:
```
Report written: .hex-skills/runtime-artifacts/runs/{run_id}/audit-report/633-test-value.md
Score: X.X/10 | Issues: N (C:N H:N M:N L:N)
```

**Note:** Tests with Usefulness Score >=15 (KEEP) are NOT included in findings -- only issues are reported.

## Critical Rules

**MANDATORY READ:** Load `shared/references/audit_worker_core_contract.md`.

- **Do not auto-fix:** Report only
- **Effort realism:** S = <1h, M = 1-4h, L = >4h
- **Score objectivity:** Base Impact and Probability on code analysis, not assumptions
- **KEEP tests not reported:** Only REVIEW and REMOVE decisions appear in findings
- **Cross-reference E2E:** REVIEW decisions depend on whether E2E already covers the scenario

## Definition of Done

**MANDATORY READ:** Load `shared/references/audit_worker_core_contract.md`.

- [ ] contextStore parsed successfully (including output_dir)
- [ ] Usefulness Score calculated for each test (Impact x Probability)
- [ ] Decisions classified: KEEP (>=15), REVIEW (10-14), REMOVE (<10)
- [ ] Findings collected with severity, location, effort, recommendation
- [ ] Score calculated using penalty algorithm
- [ ] Report written to `{output_dir}/633-test-value.md` (atomic single Write call)
- [ ] Summary written per contract

## Reference Files

- **Audit output schema:** `shared/references/audit_output_schema.md`

---
**Version:** 3.0.0
**Last Updated:** 2025-12-23

Related Skills

ln-782-test-runner

310

from levnikolaevich/claude-code-skills

Executes all test suites and reports results with coverage. Use when verifying that test infrastructure works after bootstrap.

ln-743-test-infrastructure

310

from levnikolaevich/claude-code-skills

Sets up test infrastructure with Vitest, xUnit, and pytest. Use when adding testing frameworks and sample tests to a project.

ln-654-resource-lifecycle-auditor

310

from levnikolaevich/claude-code-skills

Checks session scope mismatch, missing cleanup, pool config, error path leaks, resource holding. Use when auditing resource lifecycle.

ln-653-runtime-performance-auditor

310

from levnikolaevich/claude-code-skills

Checks blocking IO in async, unnecessary allocations, sync sleep, string concat in loops, redundant copies. Use when auditing runtime performance.

ln-652-transaction-correctness-auditor

310

from levnikolaevich/claude-code-skills

Checks transaction scope, missing rollback handling, long-held transactions, trigger/notify interaction. Use when auditing transaction correctness.

ln-651-query-efficiency-auditor

310

from levnikolaevich/claude-code-skills

Checks redundant fetches, N+1 loops, over-fetching, missing bulk operations, wrong caching scope. Use when auditing query efficiency.

ln-650-persistence-performance-auditor

310

from levnikolaevich/claude-code-skills

Coordinates persistence and performance audit across queries, transactions, runtime, and resource lifecycle. Use when auditing data layer performance.

ln-647-env-config-auditor

310

from levnikolaevich/claude-code-skills

Checks env var config sync, missing defaults, naming conventions, startup validation. Use when auditing environment configuration.

ln-646-project-structure-auditor

310

from levnikolaevich/claude-code-skills

Checks file hygiene, ignore files, framework conventions, domain/layer organization, naming. Use when auditing project structure.

ln-644-dependency-graph-auditor

310

from levnikolaevich/claude-code-skills

Builds dependency graph, detects cycles, validates boundary rules, calculates coupling metrics (Ca/Ce/I). Use when auditing dependency structure.

ln-643-api-contract-auditor

310

from levnikolaevich/claude-code-skills

Checks layer leakage in method signatures, missing DTOs, entity leakage to API, inconsistent error contracts. Use when auditing API contracts.

ln-642-layer-boundary-auditor

310

from levnikolaevich/claude-code-skills

Checks layer boundary violations, transaction boundaries, session ownership, cross-layer consistency. Use when auditing architecture layers.