brutal-honesty-review

Unvarnished technical criticism combining Linus Torvalds' precision, Gordon Ramsay's standards, and James Bach's BS-detection. Use when code/tests need harsh reality checks, certification schemes smell fishy, or technical decisions lack rigor. No sugar-coating, just surgical truth about what's broken and why.

Best use case

brutal-honesty-review is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Unvarnished technical criticism combining Linus Torvalds' precision, Gordon Ramsay's standards, and James Bach's BS-detection. Use when code/tests need harsh reality checks, certification schemes smell fishy, or technical decisions lack rigor. No sugar-coating, just surgical truth about what's broken and why.

Teams using brutal-honesty-review should expect a more consistent output, faster repeated execution, less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$curl -o ~/.claude/skills/brutal-honesty-review/SKILL.md --create-dirs "https://raw.githubusercontent.com/proffesor-for-testing/agentic-qe/main/.claude/skills/brutal-honesty-review/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/brutal-honesty-review/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How brutal-honesty-review Compares

Feature / Agentbrutal-honesty-reviewStandard Approach
Platform SupportNot specifiedLimited / Varies
Context Awareness High Baseline
Installation ComplexityUnknownN/A

Frequently Asked Questions

What does this skill do?

Unvarnished technical criticism combining Linus Torvalds' precision, Gordon Ramsay's standards, and James Bach's BS-detection. Use when code/tests need harsh reality checks, certification schemes smell fishy, or technical decisions lack rigor. No sugar-coating, just surgical truth about what's broken and why.

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

Related Guides

SKILL.md Source

# Brutal Honesty Review

<default_to_action>
When brutal honesty is needed:
1. CHOOSE MODE: Linus (technical), Ramsay (standards), Bach (BS detection)
2. VERIFY CONTEXT: Senior engineer? Repeated mistake? Critical bug? Explicit request?
3. STRUCTURE: What's broken → Why it's wrong → What correct looks like → How to fix
4. ATTACK THE WORK, not the worker
5. ALWAYS provide actionable path forward

**Quick Mode Selection:**
- **Linus**: Code is technically wrong, inefficient, misunderstands fundamentals
- **Ramsay**: Quality is subpar compared to clear excellence model
- **Bach**: Certifications, best practices, or vendor hype need reality check

**Calibration:**
- Level 1 (Direct): "This approach is fundamentally flawed because..."
- Level 2 (Harsh): "We've discussed this three times. Why is it back?"
- Level 3 (Brutal): "This is negligent. You're exposing user data because..."

**DO NOT USE FOR:** Junior devs' first PRs, demoralized teams, public forums, low psychological safety

## Minimum Findings Enforcement
All brutal honesty reviews enforce a minimum of 3 weighted findings (CRITICAL=3, HIGH=2, MEDIUM=1, LOW=0.5). If the initial review finds fewer, escalate to deeper analysis. Brutally honest reviewers should ALWAYS find something -- if you can't, explain exactly why with evidence.
</default_to_action>

## Quick Reference Card

### When to Use
| Context | Appropriate? | Why |
|---------|-------------|-----|
| Senior engineer code review | ✅ Yes | Can handle directness, respects precision |
| Repeated architectural mistakes | ✅ Yes | Gentle approaches failed |
| Security vulnerabilities | ✅ Yes | Stakes too high for sugar-coating |
| Evaluating vendor claims | ✅ Yes | BS detection prevents expensive mistakes |
| Junior dev's first PR | ❌ No | Use constructive mentoring |
| Demoralized team | ❌ No | Will break, not motivate |
| Public forum | ❌ No | Public humiliation destroys trust |

### Three Modes

| Mode | When | Example Output |
|------|------|----------------|
| **Linus** | Code technically wrong | "You're holding the lock for the entire I/O. Did you test under load?" |
| **Ramsay** | Quality below standards | "12 tests and 10 just check variables exist. Where's the business logic?" |
| **Bach** | BS detection needed | "This cert tests memorization, not bug-finding. Who actually benefits?" |

---

## The Criticism Structure

```markdown
## What's Broken
[Surgical description - specific, technical]

## Why It's Wrong
[Technical explanation, not opinion]

## What Correct Looks Like
[Clear model of excellence]

## How to Fix It
[Actionable steps, specific to context]

## Why This Matters
[Impact if not fixed]
```

---

## Mode Examples

### Linus Mode: Technical Precision

```markdown
**Problem**: Holding database connection during HTTP call

"This is completely broken. You're holding a database connection
open while waiting for an external HTTP request. Under load, you'll
exhaust the connection pool in seconds.

Did you even test this with more than one concurrent user?

The correct approach is:
1. Fetch data from DB
2. Close connection
3. Make HTTP call
4. Open new connection if needed

This is Connection Management 101. Why wasn't this caught in review?"
```

### Ramsay Mode: Standards-Driven Quality

```markdown
**Problem**: Tests only verify happy path

"Look at this test suite. 15 tests, 14 happy path scenarios.
Where's the validation testing? Edge cases? Failure modes?

This is RAW. You're testing if code runs, not if it's correct.

Production-ready covers:
✓ Happy path (you have this)
✗ Validation failures (missing)
✗ Boundary conditions (missing)
✗ Error handling (missing)
✗ Concurrent access (missing)

You wouldn't ship code with 12% coverage. Don't merge tests
with 12% scenario coverage."
```

### Bach Mode: BS Detection

```markdown
**Problem**: ISTQB certification required for QE roles

"ISTQB tests if you memorized terminology, not if you can test software.

Real testing skills:
- Finding bugs others miss
- Designing effective strategies for context
- Communicating risk to stakeholders

ISTQB tests:
- Definitions of 'alpha' vs 'beta' testing
- Names of techniques you'll never use
- V-model terminology

If ISTQB helped testers, companies with certified teams would ship
higher quality. They don't."
```

---

## Assessment Rubrics

### Code Quality (Linus Mode)

| Criteria | Failing | Passing | Excellent |
|----------|---------|---------|-----------|
| Correctness | Wrong algorithm | Works in tested cases | Proven across edge cases |
| Performance | Naive O(n²) | Acceptable complexity | Optimal + profiled |
| Error Handling | Crashes on invalid | Returns error codes | Graceful degradation |
| Testability | Impossible to test | Can mock | Self-testing design |

### Test Quality (Ramsay Mode)

| Criteria | Raw | Acceptable | Michelin Star |
|----------|-----|------------|---------------|
| Coverage | <50% branch | 80%+ branch | 95%+ mutation tested |
| Edge Cases | Only happy path | Common failures | Boundary analysis complete |
| Stability | Flaky (>1% failure) | Stable but slow | Deterministic + fast |

### BS Detection (Bach Mode)

| Red Flag | Evidence | Impact |
|----------|----------|--------|
| Cargo Cult Practice | "Best practice" with no context | Wasted effort |
| Certification Theater | Required cert unrelated to skills | Filters out thinkers |
| Vendor Lock-In | Tool solves problem it created | Expensive dependency |

---

## Agent Integration

```typescript
// Brutal honesty code review
await Task("Code Review", {
  code: pullRequestDiff,
  mode: 'linus',  // or 'ramsay', 'bach'
  calibration: 'direct',  // or 'harsh', 'brutal'
  requireActionable: true
}, "qe-code-reviewer");

// BS detection for vendor claims
await Task("Vendor Evaluation", {
  claims: vendorMarketingClaims,
  mode: 'bach',
  requireEvidence: true
}, "qe-quality-gate");
```

---

## Agent Coordination Hints

### Memory Namespace
```
aqe/brutal-honesty/
├── code-reviews/*     - Technical review findings
├── bs-detection/*     - Vendor/cert evaluations
└── calibration/*      - Context-appropriate levels
```

### Fleet Coordination
```typescript
const reviewFleet = await FleetManager.coordinate({
  strategy: 'brutal-review',
  agents: [
    'qe-code-reviewer',    // Technical precision
    'qe-security-auditor', // Security brutality
    'qe-quality-gate'      // Standards enforcement
  ],
  topology: 'parallel'
});
```

---

## Related Skills
- [code-review-quality](../code-review-quality/) - Diplomatic version
- [context-driven-testing](../context-driven-testing/) - Foundation for Bach mode
- [sherlock-review](../sherlock-review/) - Evidence-based investigation

---

## Remember

**Brutal honesty eliminates ambiguity but has costs.** Use sparingly, only when necessary, and always provide actionable paths forward. Attack the work, never the worker.

**The Brutal Honesty Contract**: Get explicit consent. "I'm going to give unfiltered technical feedback. This will be direct, possibly harsh. The goal is clarity, not cruelty."

Related Skills

qe-sherlock-review

298
from proffesor-for-testing/agentic-qe

Evidence-based investigative code review using deductive reasoning to determine what actually happened versus what was claimed. Use when verifying implementation claims, investigating bugs, validating fixes, or conducting root cause analysis. Elementary approach to finding truth through systematic observation.

qe-pr-review

298
from proffesor-for-testing/agentic-qe

Scope-aware GitHub PR review with user-friendly tone and trust tier validation

qe-github-code-review

298
from proffesor-for-testing/agentic-qe

Comprehensive GitHub code review with AI-powered swarm coordination

qe-code-review-quality

298
from proffesor-for-testing/agentic-qe

Conduct context-driven code reviews focusing on quality, testability, and maintainability. Use when reviewing code, providing feedback, or establishing review practices.

qe-brutal-honesty-review

298
from proffesor-for-testing/agentic-qe

Unvarnished technical criticism combining Linus Torvalds' precision, Gordon Ramsay's standards, and James Bach's BS-detection. Use when code/tests need harsh reality checks, certification schemes smell fishy, or technical decisions lack rigor. No sugar-coating, just surgical truth about what's broken and why.

sherlock-review

298
from proffesor-for-testing/agentic-qe

Evidence-based investigative code review using deductive reasoning to determine what actually happened versus what was claimed. Use when verifying implementation claims, investigating bugs, validating fixes, or conducting root cause analysis. Elementary approach to finding truth through systematic observation.

pr-review

298
from proffesor-for-testing/agentic-qe

Use when reviewing a GitHub PR for quality, scope correctness, trust tier compliance, or generating user-friendly review feedback.

github-code-review

298
from proffesor-for-testing/agentic-qe

Comprehensive GitHub code review with AI-powered swarm coordination

code-review-quality

298
from proffesor-for-testing/agentic-qe

Conduct context-driven code reviews focusing on quality, testability, and maintainability. Use when reviewing code, providing feedback, or establishing review practices.

qe-visual-testing-advanced

298
from proffesor-for-testing/agentic-qe

Advanced visual regression testing with pixel-perfect comparison, AI-powered diff analysis, responsive design validation, and cross-browser visual consistency. Use when detecting UI regressions, validating designs, or ensuring visual consistency.

qe-verification-quality

298
from proffesor-for-testing/agentic-qe

Comprehensive truth scoring, code quality verification, and automatic rollback system with 0.95 accuracy threshold for ensuring high-quality agent outputs and codebase reliability.

qe-testability-scoring

298
from proffesor-for-testing/agentic-qe

AI-powered testability assessment using 10 principles of intrinsic testability with Playwright and optional Vibium integration. Evaluates web applications against Observability, Controllability, Algorithmic Simplicity, Transparency, Stability, Explainability, Unbugginess, Smallness, Decomposability, and Similarity. Use when assessing software testability, evaluating test readiness, identifying testability improvements, or generating testability reports.