adversary

On-demand adversarial quality reviews using strategy templates. Selects strategies by criticality level, executes adversarial templates against deliverables, and scores quality using the LLM-as-Judge rubric. Integrates with the quality-enforcement.md SSOT.

by geekatron

Best use case

adversary is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using adversary should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/adversary/SKILL.md --create-dirs "https://raw.githubusercontent.com/geekatron/jerry/main/skills/adversary/SKILL.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/adversary/SKILL.md inside your project
Restart your AI agent — it will auto-discover the skill

How adversary Compares

Feature / Agent	adversary	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

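What does this skill do?

It runs on-demand adversarial quality reviews: it selects strategies by criticality level, executes strategy templates against a deliverable, and scores the result with the LLM-as-Judge rubric.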

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# Adversary Skill

> **Version:** 1.0.0
> **Framework:** Jerry Adversarial Quality (ADV)
> **Constitutional Compliance:** Jerry Constitution v1.0
> **SSOT Reference:** `.context/rules/quality-enforcement.md`

## Document Audience (Triple-Lens)

This SKILL.md serves multiple audiences:

| Level | Audience | Sections to Focus On |
|-------|----------|---------------------|
| **L0 (ELI5)** | New users, stakeholders | [Purpose](#purpose), [When to Use](#when-to-use-this-skill), [Routing Disambiguation](#routing-disambiguation), [When to Use /adversary vs ps-critic](#when-to-use-adversary-vs-ps-critic), [Quick Reference](#quick-reference) |
| **L1 (Engineer)** | Developers invoking agents | [Invoking an Agent](#invoking-an-agent), [Available Agents](#available-agents), [Dependencies](#dependencies--prerequisites), [Adversarial Quality Mode](#adversarial-quality-mode) |
| **L2 (Architect)** | Workflow designers | [P-003 Compliance](#p-003-compliance), [H-14 Integration](#integration-with-creator-critic-revision-cycle-h-14), [Constitutional Compliance](#constitutional-compliance), [Strategy Templates](#strategy-templates) |

---

## Purpose

The Adversary skill provides **on-demand adversarial quality reviews** using strategy templates from the Jerry quality framework. Unlike the problem-solving skill's integrated adversarial mode (which operates within creator-critic loops), the adversary skill is invoked explicitly when you need a standalone adversarial assessment of any deliverable.

### Key Capabilities

- **Strategy Selection** - Maps criticality levels (C1-C4) to the correct adversarial strategy sets per SSOT
- **Strategy Execution** - Loads and runs strategy templates against deliverables with structured finding classification
- **Quality Scoring** - Implements S-014 LLM-as-Judge rubric scoring with a 6-dimension weighted composite
- **Criticality-Aware** - Automatically adjusts review depth based on deliverable criticality
- **Template-Driven** - All strategies follow standardized templates from `.context/templates/adversarial/`

---

## When to Use This Skill

Activate when:

- Applying adversarial strategies to a completed deliverable outside a creator-critic loop
- Scoring deliverable quality using the SSOT 6-dimension rubric
- Selecting the appropriate strategy set for a given criticality level
- Running a full C4 tournament review with all 10 strategies
- Pairing S-003 (Steelman) before S-002 (Devil's Advocate) per H-16
- Needing a standalone quality assessment without revision cycles

NEVER invoke this skill when:
- Task requires iterative creator-critic-revision loop -- Consequence: Adversarial one-shot assessment applied to iterative work produces premature rejection without revision pathway; use `/problem-solving` with ps-critic instead
- Task is routine code review for quick defect checks -- Consequence: Full adversarial strategy template execution (S-001 through S-014) applied to routine defect detection wastes significant context budget on strategy selection and template loading; use ps-reviewer instead
- Task is binary constraint validation (pass/fail compliance) -- Consequence: Adversarial strategies assess quality dimensions, not binary constraint compliance; traceability matrices not generated; use ps-validator instead
- Work is routine code changes at C1 criticality -- Consequence: Full adversarial overhead (adv-selector, adv-executor, adv-scorer) applied to C1 routine tasks consumes disproportionate context budget for low-risk work; use self-review (S-010) only
- Defects or bugs have obvious solutions -- Consequence: Adversarial quality assessment evaluates existing deliverables and does not diagnose root causes; use `/problem-solving` for root-cause analysis instead
- User explicitly requests a quick review without adversarial rigor -- Consequence: Overriding user preference violates P-020 (user authority); respect the request

See [Routing Disambiguation](#routing-disambiguation) for full exclusion conditions with consequences.

> **Note:** Use `/adversary` for adversarial code review (e.g., red team security review, tournament quality assessment of code artifacts). Use `ps-reviewer` for routine defect detection.

### Relationship to ps-critic

The adv-scorer and ps-critic agents share the same S-014 LLM-as-Judge rubric and 6-dimension weighted composite scoring methodology. They serve different workflow positions:

| Aspect | adv-scorer | ps-critic |
|--------|-----------|-----------|
| **Workflow** | Standalone/on-demand scoring | Embedded in creator-critic-revision loops |
| **Output** | Focused score report with L0 summary | L0/L1/L2 multi-level critique report |
| **Iteration** | May be invoked once or re-invoked for re-scoring | Iterates within the H-14 cycle |
| **Invocation** | Via `/adversary` skill | Via `/problem-solving` skill |

Both agents produce comparable scores from the same rubric; the choice depends on whether you need standalone assessment (adv-scorer) or iterative critique with revision guidance (ps-critic).

---

## Available Agents

| Agent | Role | Model | Output Location |
|-------|------|-------|-----------------|
| `adv-selector` | Strategy Selector - Maps criticality to strategy sets | haiku | Strategy selection plan |
| `adv-executor` | Strategy Executor - Runs strategy templates against deliverables | sonnet | Strategy execution reports |
| `adv-scorer` | Quality Scorer - LLM-as-Judge rubric scoring | sonnet | Quality score reports |

---

## P-003 Compliance

All adversary agents are **workers**, NOT orchestrators. The MAIN CONTEXT (Claude session) orchestrates the workflow.

```
P-003 AGENT HIERARCHY:
======================

  +-------------------+
  | MAIN CONTEXT      |  <-- Orchestrator (Claude session)
  | (orchestrator)    |
  +-------------------+
     |        |        |
     v        v        v
  +------+ +------+ +------+
  | adv- | | adv- | | adv- |   <-- Workers (max 1 level)
  |select| |exec  | |scorer|
  +------+ +------+ +------+

  Agents CANNOT invoke other agents.
  Agents CANNOT spawn subagents.
  Only MAIN CONTEXT orchestrates the sequence.
```

---

## Invoking an Agent

### Option 1: Natural Language Request

Simply describe what you need:

```
"Run an adversarial review of this ADR at C3 criticality"
"Score this deliverable with LLM-as-Judge"
"What strategies should I apply for a C2 review?"
"Run Devil's Advocate and Steelman on this design document"
"Execute a full C4 tournament review on the architecture proposal"
```

The orchestrator will select the appropriate agent(s) based on keywords and context.

### Option 2: Explicit Agent Request

Request a specific agent:

```
"Use adv-selector to pick strategies for C3 criticality"
"Have adv-executor run S-002 Devil's Advocate on the ADR"
"I need adv-scorer to produce a quality score for this synthesis"
```

### Option 3: Task Tool Invocation

For programmatic invocation within workflows:

```python
Task(
    description="adv-selector: Strategy selection for C3",
    subagent_type="general-purpose",
    prompt="""
You are the adv-selector agent (v1.0.0).

## ADV CONTEXT (REQUIRED)
- **Criticality Level:** C3
- **Deliverable Type:** Architecture Decision Record
- **Deliverable Path:** docs/decisions/adr-042-persistence.md

## MANDATORY PERSISTENCE (P-002)
Create file at: {output_path}

## TASK
Select the strategy set for C3 criticality per SSOT.
"""
)
```

---

## Dependencies / Prerequisites

The adversary skill depends on external artifacts created by other enablers. These MUST be in place before the skill is fully operational.

### Strategy Template Files

All 10 strategy templates in `.context/templates/adversarial/` are created by separate enablers:

| Template | Source Enabler | Status |
|----------|---------------|--------|
| `s-001-red-team.md` | EN-809 | Created by EN-809 |
| `s-002-devils-advocate.md` | EN-806 | Created by EN-806 |
| `s-003-steelman.md` | EN-807 | Created by EN-807 |
| `s-004-pre-mortem.md` | EN-808 | Created by EN-808 |
| `s-007-constitutional-ai.md` | EN-805 | Created by EN-805 |
| `s-010-self-refine.md` | EN-804 | Created by EN-804 |
| `s-011-cove.md` | EN-809 | Created by EN-809 |
| `s-012-fmea.md` | EN-808 | Created by EN-808 |
| `s-013-inversion.md` | EN-808 | Created by EN-808 |
| `s-014-llm-as-judge.md` | EN-803 | Created by EN-803 |

**Naming Convention:** Templates follow the pattern `s-{NNN}-{slug}.md` where `{NNN}` is the strategy ID from the quality-enforcement SSOT and `{slug}` is a hyphenated descriptor (e.g., `s-002-devils-advocate.md`). These are static reference documents versioned alongside the codebase — they are not dynamically generated.
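
The naming convention above can be checked mechanically. The following is a minimal sketch, not part of the skill itself: the function name `parse_template_name` and the exact slug character set (lowercase letters and digits separated by hyphens) are assumptions made for illustration.

```python
import re

# Pattern derived from the convention s-{NNN}-{slug}.md.
# Assumption: slugs are lowercase alphanumerics joined by single hyphens.
TEMPLATE_NAME = re.compile(r"^s-(\d{3})-([a-z0-9]+(?:-[a-z0-9]+)*)\.md$")


def parse_template_name(filename: str) -> tuple[str, str]:
    """Split a template filename into (strategy ID, slug)."""
    match = TEMPLATE_NAME.match(filename)
    if match is None:
        raise ValueError(f"not a strategy template name: {filename!r}")
    number, slug = match.groups()
    return f"S-{number}", slug


print(parse_template_name("s-002-devils-advocate.md"))
```

A check like this could run before template loading so that a mistyped filename is reported as a naming error rather than as a missing file.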

**Fallback behavior:** If a template file is not found, adv-executor MUST:
1. Emit a WARNING to the orchestrator with the missing template path
2. Request the corrected path from the orchestrator
3. Do NOT silently skip the strategy — the orchestrator decides whether to skip or provide an alternative
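
As a rough illustration of the fallback rule, the sketch below returns a warning object instead of skipping when a template file is absent. The names `MissingTemplateWarning` and `load_template` are hypothetical, and how the warning actually reaches the orchestrator is not specified in this document; here it is simply the return value.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Union

TEMPLATE_DIR = Path(".context/templates/adversarial")


@dataclass
class MissingTemplateWarning:
    """Hypothetical warning payload handed back to the orchestrator."""
    strategy_id: str
    missing_path: str
    action: str = "orchestrator must supply a corrected path or decide to skip"


def load_template(
    strategy_id: str, filename: str, template_dir: Path = TEMPLATE_DIR
) -> Union[str, MissingTemplateWarning]:
    """Return the template text, or a warning naming the missing path."""
    path = template_dir / filename
    if not path.is_file():
        # Never skip silently: surface the problem and let the caller decide.
        return MissingTemplateWarning(strategy_id, str(path))
    return path.read_text(encoding="utf-8")


result = load_template("S-002", "s-002-devils-advocate.md", Path("no-such-dir"))
if isinstance(result, MissingTemplateWarning):
    print(f"WARNING: template for {result.strategy_id} not found at {result.missing_path}")
```

Returning the warning rather than raising keeps the skip-or-substitute decision with the caller, which matches the rule that the orchestrator decides.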

The skill skeleton (EN-802) defines the structure; the template enablers populate the content.

### SSOT File

- `.context/rules/quality-enforcement.md` -- MUST exist. All thresholds, strategy IDs, criticality levels, and quality dimensions are sourced from here.

---

## Adversarial Quality Mode

> **SSOT Reference:** `.context/rules/quality-enforcement.md` -- all thresholds, strategy IDs, criticality levels, and quality dimensions are defined there. NEVER hardcode values; always reference the SSOT.

### Strategy Catalog

The quality framework provides 10 selected adversarial strategies across 4 mechanistic families. See `.context/rules/quality-enforcement.md` (Strategy Catalog section) for the authoritative list.

| Family | Strategies | Adversary Application |
|--------|-----------|----------------------|
| **Iterative Self-Correction** | S-014 (LLM-as-Judge), S-007 (Constitutional AI Critique), S-010 (Self-Refine) | Quality scoring, constitutional compliance, self-review |
| **Dialectical Synthesis** | S-003 (Steelman Technique) | Strengthen arguments before critique (H-16 REQUIRED) |
| **Role-Based Adversarialism** | S-002 (Devil's Advocate), S-004 (Pre-Mortem Analysis), S-001 (Red Team Analysis) | Challenge assumptions, anticipate failures, adversarial exploration |
| **Structured Decomposition** | S-013 (Inversion Technique), S-012 (FMEA), S-011 (Chain-of-Verification) | Systematic failure mode analysis, verification chains |

### Strategy Templates

All strategies use standardized templates from `.context/templates/adversarial/`:

| Template | Strategy | Purpose |
|----------|----------|---------|
| `s-001-red-team.md` | S-001 Red Team Analysis | Adversarial exploration of attack surfaces |
| `s-002-devils-advocate.md` | S-002 Devil's Advocate | Challenge assumptions and key claims |
| `s-003-steelman.md` | S-003 Steelman Technique | Strengthen the best version of the argument |
| `s-004-pre-mortem.md` | S-004 Pre-Mortem Analysis | Anticipate failure modes |
| `s-007-constitutional-ai.md` | S-007 Constitutional AI Critique | Constitutional compliance verification |
| `s-010-self-refine.md` | S-010 Self-Refine | Iterative self-improvement |
| `s-011-cove.md` | S-011 Chain-of-Verification | Systematic claim verification |
| `s-012-fmea.md` | S-012 FMEA | Failure Mode and Effects Analysis |
| `s-013-inversion.md` | S-013 Inversion Technique | Invert key claims to find blind spots |
| `s-014-llm-as-judge.md` | S-014 LLM-as-Judge | Rubric-based quality scoring |

### Criticality-Based Strategy Selection

Per SSOT, strategy activation follows criticality levels:

| Level | Required Strategies | Optional Strategies |
|-------|---------------------|---------------------|
| **C1 (Routine)** | S-010 | S-003, S-014 |
| **C2 (Standard)** | S-007, S-002, S-014 | S-003, S-010 |
| **C3 (Significant)** | C2 + S-004, S-012, S-013 | S-001, S-003, S-010, S-011 |
| **C4 (Critical)** | All 10 selected strategies | None (all required) |
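
The table above can be read as a simple lookup. The sketch below copies the table values into dictionaries purely for illustration; per the SSOT note, a real implementation should read them from `.context/rules/quality-enforcement.md` rather than hardcode them. The function name `select_strategies` is an assumption.

```python
ALL_STRATEGIES = [
    "S-001", "S-002", "S-003", "S-004", "S-007",
    "S-010", "S-011", "S-012", "S-013", "S-014",
]

# Illustrative copy of the table; the SSOT file remains authoritative.
C2_REQUIRED = ["S-007", "S-002", "S-014"]
REQUIRED = {
    "C1": ["S-010"],
    "C2": C2_REQUIRED,
    "C3": C2_REQUIRED + ["S-004", "S-012", "S-013"],
    "C4": ALL_STRATEGIES,
}
OPTIONAL = {
    "C1": ["S-003", "S-014"],
    "C2": ["S-003", "S-010"],
    "C3": ["S-001", "S-003", "S-010", "S-011"],
    "C4": [],
}


def select_strategies(level: str, include_optional: bool = False) -> list[str]:
    """Return the strategy IDs for a criticality level (C1 to C4)."""
    if level not in REQUIRED:
        raise ValueError(f"unknown criticality level: {level!r}")
    selected = list(REQUIRED[level])
    if include_optional:
        selected += OPTIONAL[level]
    return selected


print(select_strategies("C3"))
```

Note that the returned list is a set of IDs, not an execution order; ordering constraints such as H-16 are applied separately.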

### H-16 Ordering Constraint

**HARD rule:** S-003 (Steelman) MUST be applied before S-002 (Devil's Advocate). Always strengthen the argument before challenging it.
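
A planned execution sequence can be checked against this ordering rule before anything runs. The sketch below is one possible check, with the hypothetical name `satisfies_h16`. It only verifies order when both strategies are present; whether S-002 may run without S-003 at all (for example at C2, where S-003 is optional) is not settled here and is left to the SSOT.

```python
def satisfies_h16(sequence: list[str]) -> bool:
    """True unless S-002 appears before S-003 in the planned sequence."""
    if "S-003" not in sequence or "S-002" not in sequence:
        # Nothing to order; presence requirements are out of scope here.
        return True
    return sequence.index("S-003") < sequence.index("S-002")


print(satisfies_h16(["S-003", "S-002", "S-014"]))  # True
print(satisfies_h16(["S-002", "S-003", "S-014"]))  # False
```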

### Quality Scoring (S-014)

The SSOT defines 6 quality dimensions with weights:

| Dimension | Weight |
|-----------|--------|
| Completeness | 0.20 |
| Internal Consistency | 0.20 |
| Methodological Rigor | 0.20 |
| Evidence Quality | 0.15 |
| Actionability | 0.15 |
| Traceability | 0.10 |

**Threshold:** >= 0.92 weighted composite for C2+ deliverables (H-13)

**Leniency bias counteraction:** Score strictly against rubric criteria. When uncertain between adjacent scores, choose the lower one.
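
To make the arithmetic concrete, the sketch below computes the weighted composite from per-dimension scores and compares it with the threshold. It assumes each dimension is scored on a 0 to 1 scale, which the 0.92 threshold suggests but this section does not state outright. The weights and threshold are copied from the table for illustration only; the SSOT is the authoritative source. Function names are hypothetical.

```python
# Illustrative copy of the SSOT weights; they sum to 1.0.
WEIGHTS = {
    "Completeness": 0.20,
    "Internal Consistency": 0.20,
    "Methodological Rigor": 0.20,
    "Evidence Quality": 0.15,
    "Actionability": 0.15,
    "Traceability": 0.10,
}
THRESHOLD = 0.92  # H-13, applies to C2+ deliverables


def weighted_composite(scores: dict[str, float]) -> float:
    """Sum of weight * score over all six dimensions (scores assumed in 0..1)."""
    missing = WEIGHTS.keys() - scores.keys()
    if missing:
        raise ValueError(f"missing dimension scores: {sorted(missing)}")
    return sum(WEIGHTS[dim] * scores[dim] for dim in WEIGHTS)


def meets_threshold(scores: dict[str, float], threshold: float = THRESHOLD) -> bool:
    return weighted_composite(scores) >= threshold


sample = {
    "Completeness": 0.95,
    "Internal Consistency": 0.95,
    "Methodological Rigor": 0.90,
    "Evidence Quality": 0.90,
    "Actionability": 0.95,
    "Traceability": 0.90,
}
print(round(weighted_composite(sample), 4), meets_threshold(sample))
```

For the sample scores the composite works out to 0.19 + 0.19 + 0.18 + 0.135 + 0.1425 + 0.09 = 0.9275, which clears the 0.92 threshold; a deliverable scoring 0.90 on every dimension would not.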

---

## When to Use /adversary vs ps-critic

Both `/adversary` and `ps-critic` apply adversarial strategies, but serve different workflow positions:

| Aspect | /adversary Skill | ps-critic Agent |
|--------|-----------------|----------------|
| **Use Case** | Standalone adversarial reviews, tournament scoring, strategy template execution | Embedded quality critique within creator-critic-revision loops |
| **Invocation** | Explicit on-demand (`/adversary` or natural language request) | Invoked by orchestrator within H-14 cycle |
| **Output Focus** | Strategy-specific findings (adv-executor) + quality score (adv-scorer) | L0/L1/L2 multi-level critique with dimension-level improvement guidance |
| **Iteration** | May be used once or re-invoked for re-scoring after revision | Iterates within the H-14 minimum 3-iteration cycle |
| **Strategy Coverage** | Full strategy set per criticality (C1-C4), including tournament mode (all 10) | Applies strategies appropriate to criticality, embedded in workflow |
| **Agents** | adv-selector, adv-executor, adv-scorer | ps-critic (single agent) |
| **Output Artifacts** | Strategy execution reports + quality score report | Critique report with improvement recommendations |

**When to Use Each:**

- **Use `/adversary` when:**
  - You need a formal adversarial review outside a creator-critic loop
  - You need tournament mode scoring with all 10 strategies (C4)
  - You need strategy-specific finding reports (adv-executor outputs)
  - You need standalone quality scoring with S-014 rubric
  - You need to apply a specific strategy template (e.g., "run S-002 Devil's Advocate on this ADR")

- **Use `ps-critic` when:**
  - You are within an orchestrated creator-critic-revision workflow
  - You need iterative improvement guidance across multiple revision cycles
  - You need lightweight iteration-level critique without full tournament overhead
  - You are working within `/problem-solving` or `/orchestration` skills

**Complementary Use:**

Both can work together:
- `ps-critic` applies strategies within workflows for embedded quality cycles
- `/adversary` orchestrates cross-strategy tournament reviews for C4 critical deliverables
- `ps-critic` uses the same S-014 rubric and dimension scoring as adv-scorer for consistency

---

## Tournament Mode

Tournament mode executes all 10 adversarial strategies against a C4 (Critical) deliverable in a deterministic sequence. This is the most comprehensive review level, required for irreversible decisions.

### Execution Order

All 10 strategies run in the recommended order from `skills/adversary/agents/adv-selector.md`:

- **Group A — Self-Review**: S-010 (Self-Refine)
- **Group B — Strengthen**: S-003 (Steelman Technique)
- **Group C — Challenge**: S-002 (Devil's Advocate), S-004 (Pre-Mortem Analysis), S-001 (Red Team Analysis)
- **Group D — Verify**: S-007 (Constitutional AI Critique), S-011 (Chain-of-Verification)
- **Group E — Decompose**: S-012 (FMEA), S-013 (Inversion Technique)
- **Group F — Score**: S-014 (LLM-as-Judge) — **ALWAYS LAST**
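
The grouped order above flattens into a single sequence. The sketch below encodes the groups as listed and adds a small sanity check; the names `tournament_sequence` and `sequence_problems` are hypothetical, and the authoritative order lives in `skills/adversary/agents/adv-selector.md`.

```python
# Groups A to F as listed above: (group letter, label, strategy IDs in order).
TOURNAMENT_GROUPS = [
    ("A", "Self-Review", ["S-010"]),
    ("B", "Strengthen", ["S-003"]),
    ("C", "Challenge", ["S-002", "S-004", "S-001"]),
    ("D", "Verify", ["S-007", "S-011"]),
    ("E", "Decompose", ["S-012", "S-013"]),
    ("F", "Score", ["S-014"]),
]


def tournament_sequence() -> list[str]:
    """Flatten the groups into one ordered list of strategy IDs."""
    return [sid for _, _, ids in TOURNAMENT_GROUPS for sid in ids]


def sequence_problems(sequence: list[str]) -> list[str]:
    """Return human-readable problems; an empty list means the order looks fine."""
    problems = []
    if len(set(sequence)) != len(sequence):
        problems.append("duplicate strategy IDs")
    if not sequence or sequence[-1] != "S-014":
        problems.append("S-014 must run last")
    if "S-002" in sequence and "S-003" in sequence:
        if sequence.index("S-003") > sequence.index("S-002"):
            problems.append("H-16: S-003 must precede S-002")
    return problems


order = tournament_sequence()
print(order)
print(sequence_problems(order))
```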

### Aggregation

Findings from all strategy execution reports are collected across all 9 executor runs. The adv-scorer agent (S-014) receives these aggregated findings as input evidence when producing the final composite score. Critical findings from any strategy block PASS regardless of score.
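
The blocking rule can be sketched as a small gate: a deliverable passes only if the composite meets the threshold and no finding is critical. The `Finding` shape and the severity labels below are assumptions for illustration; the real finding classification comes from the strategy templates. How REVISE and ESCALATE are distinguished is not described in this section, so the sketch returns only a pass/fail boolean.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    """Hypothetical finding record aggregated from executor reports."""
    strategy_id: str
    severity: str  # assumed labels: "critical", "major", "minor"
    summary: str


def can_pass(composite: float, findings: list[Finding], threshold: float = 0.92) -> bool:
    """A critical finding from any strategy blocks PASS regardless of score."""
    has_critical = any(f.severity == "critical" for f in findings)
    return composite >= threshold and not has_critical


findings = [
    Finding("S-012", "minor", "One failure mode lacks a mitigation owner"),
    Finding("S-001", "critical", "Rollback path is undefined"),
]
print(can_pass(0.95, findings))  # False: blocked by the critical finding
```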

### Timing Expectations

A C4 tournament with all 10 strategies requires approximately 11 agent invocations:

1. **adv-selector** (1 invocation) — Strategy selection
2. **adv-executor** (9 invocations) — One per strategy from Groups A-E
3. **adv-scorer** (1 invocation) — Final scoring with S-014
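
The count follows directly from the sequence: one selector call, one executor call per non-scoring strategy, and one scorer call. A minimal sketch, using the hypothetical helper `invocation_plan`:

```python
def invocation_plan(sequence: list[str]) -> list[tuple[str, str]]:
    """Pair each invocation with its agent: selector first, scorer last."""
    plan = [("adv-selector", "strategy selection")]
    # Every strategy except S-014 is run by adv-executor.
    plan += [("adv-executor", sid) for sid in sequence if sid != "S-014"]
    plan.append(("adv-scorer", "S-014"))
    return plan


C4_SEQUENCE = [
    "S-010", "S-003", "S-002", "S-004", "S-001",
    "S-007", "S-011", "S-012", "S-013", "S-014",
]
plan = invocation_plan(C4_SEQUENCE)
print(len(plan))  # 11
```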

Typical duration depends on deliverable size and complexity. Expect longer processing times for large architecture documents or governance changes.

---

## Integration with Creator-Critic-Revision Cycle (H-14)

H-14 mandates a minimum 3-iteration creator-critic-revision cycle for C2+ deliverables. The adversary skill is **not** a revision loop manager -- it provides standalone adversarial assessment. The integration boundary is:

1. **adv-scorer produces the quality score.** When invoked, adv-scorer evaluates the deliverable and returns a PASS/REVISE/ESCALATE verdict.
2. **If REVISE, the orchestrator feeds findings back to the creator agent.** The adversary skill's findings (from adv-executor) and score breakdown (from adv-scorer) provide actionable input for the next revision iteration.
3. **The adversary skill can be re-invoked for re-scoring.** After the creator revises, the orchestrator may invoke adv-scorer again (with the `Prior Score` context field) to track improvement across iterations.
4. **Minimum 3 iterations are the orchestrator's responsibility.** The adversary skill does not enforce iteration count -- it scores when asked. The orchestrator (or `/orchestration` skill) tracks the iteration count per H-14.
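
The division of responsibility in the four points above might look like the following orchestrator-side loop. This is a sketch under several assumptions: `score` and `revise` are hypothetical stand-ins for invoking adv-scorer and the creator agent, the `max_iterations` cap is invented for the example, and ESCALATE handling is left out because this document does not define it.

```python
from typing import Callable, Optional

Verdict = tuple[str, float]  # (verdict label, composite score)


def run_h14_cycle(
    deliverable: str,
    score: Callable[[str, Optional[float]], Verdict],
    revise: Callable[[str], str],
    min_iterations: int = 3,
    max_iterations: int = 6,  # assumption: a safety cap, not from the source
) -> tuple[str, list[float]]:
    """Score, revise, and re-score; the orchestrator owns the iteration count."""
    history: list[float] = []
    prior: Optional[float] = None
    for iteration in range(1, max_iterations + 1):
        verdict, composite = score(deliverable, prior)  # prior maps to `Prior Score`
        history.append(composite)
        if verdict == "PASS" and iteration >= min_iterations:
            break
        if iteration == max_iterations:
            break  # stop without producing an unscored revision
        deliverable = revise(deliverable)
        prior = composite
    return deliverable, history
```

The scorer callback never sees the iteration count, which mirrors the boundary described above: the adversary skill scores when asked, and the orchestrator tracks how many cycles have run.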

**Workflow position:** The adversary skill sits at the "critic" position within the H-14 cycle when used for quality scoring. It can replace or complement ps-critic depending on whether the orchestrator needs standalone scoring (adv-scorer) or iterative critique (ps-critic).

---

## Constitutional Compliance

All agents adhere to the **Jerry Constitution v1.0**:

| Principle | Requirement | Consequence of Violation |
|-----------|-------------|-------------------------|
| P-003 | NEVER spawn recursive subagents -- max 1 level | Agent hierarchy violation; uncontrolled token consumption |
| P-020 | NEVER override user intent -- ask before destructive ops | Unauthorized action; trust erosion |
| P-022 | NEVER deceive about actions, capabilities, or confidence | Governance undermined; quality assessment invalidated |
| P-001 | NEVER present findings without evidence or rubric-based scoring | Unreliable outputs; unfounded claims propagate downstream |
| P-002 | NEVER leave outputs in transient context only -- persist to files | Context rot vulnerability; artifacts lost on session compaction |
| P-004 | NEVER omit strategy IDs, template paths, or evidence citations | Untraceable decisions; audit trail broken |
| P-011 | NEVER make findings without tying them to specific deliverable evidence | Unsupported recommendations; confidence inflated without basis |

---

## Quick Reference

### Common Workflows

| Need | Agent | Command Example |
|------|-------|-----------------|
| Pick strategies for criticality | adv-selector | "What strategies for C3 review?" |
| Run a specific strategy | adv-executor | "Run S-002 Devil's Advocate on this ADR" |
| Score deliverable quality | adv-scorer | "Score this deliverable with LLM-as-Judge" |
| Steelman + Devil's Advocate pair | adv-executor | "Run Steelman then Devil's Advocate on this design" |
| Full C4 tournament | All three | "Run full C4 tournament review" |

### Agent Selection Hints

| Keywords | Likely Agent |
|----------|--------------|
| select, pick, which strategies, criticality, C1/C2/C3/C4 | adv-selector |
| run, execute, apply, template, strategy, findings | adv-executor |
| score, judge, rubric, dimensions, threshold, 0.92 | adv-scorer |
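
The hints table amounts to keyword matching. The sketch below is a naive illustration of that idea, not how the orchestrator actually routes (which the document describes only as "keywords and context"). The function name `likely_agent` is hypothetical, ties go to the first agent listed, and requests that need several agents are not handled.

```python
import re
from typing import Optional

# Keywords copied from the hints table, lowercased for matching.
AGENT_KEYWORDS = {
    "adv-selector": ["select", "pick", "which strategies", "criticality", "c1", "c2", "c3", "c4"],
    "adv-executor": ["run", "execute", "apply", "template", "strategy", "findings"],
    "adv-scorer": ["score", "judge", "rubric", "dimensions", "threshold", "0.92"],
}


def likely_agent(request: str) -> Optional[str]:
    """Return the agent with the most whole-word keyword hits, or None."""
    text = request.lower()
    best_agent, best_hits = None, 0
    for agent, keywords in AGENT_KEYWORDS.items():
        hits = sum(
            1 for kw in keywords if re.search(rf"\b{re.escape(kw)}\b", text)
        )
        if hits > best_hits:
            best_agent, best_hits = agent, hits
    return best_agent


print(likely_agent("Score this deliverable with LLM-as-Judge"))  # adv-scorer
```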

---

## Routing Disambiguation

> When this skill is the wrong choice and what happens if misrouted.

| Condition | Use Instead | Consequence of Misrouting |
|-----------|-------------|--------------------------|
| Iterative creator-critic-revision loop needed | `/problem-solving` (ps-critic) | Adversarial one-shot assessment applied to iterative work produces premature rejection without revision pathway; ps-critic operates within H-14 revision cycles while /adversary produces standalone assessments |
| Routine code review for quick defect checks | `/problem-solving` (ps-reviewer) | Full adversarial strategy template execution (S-001 through S-014) applied to routine defect detection wastes significant context budget on strategy selection and template loading |
| Constraint validation (pass/fail compliance) | `/problem-solving` (ps-validator) | Adversarial strategies assess quality dimensions, not binary constraint compliance; ps-validator produces traceability matrices while /adversary produces quality scores |
| Research, analysis, or root cause investigation | `/problem-solving` (ps-researcher or ps-investigator) | Adversarial agents evaluate existing deliverables and do not produce new analysis; no research methodology or causal investigation capability |
| Security-hardened software design or threat modeling | `/eng-team` | /adversary applies quality assessment strategies (S-001 Red Team Analysis is quality-focused); /eng-team provides STRIDE/DREAD threat modeling and OWASP compliance |
| Offensive security testing or penetration testing | `/red-team` | /adversary "red team" keyword refers to S-001 quality strategy; /red-team provides MITRE ATT&CK kill chain methodology for authorized penetration testing |
| C1 routine work with obvious solutions | Self-review (S-010) only | Full adversarial overhead (adv-selector, adv-executor, adv-scorer) applied to C1 routine tasks consumes disproportionate context budget for low-risk work |

---

## References

| Source | Content |
|--------|---------|
| `.context/rules/quality-enforcement.md` | SSOT for thresholds, strategies, criticality levels |
| `.context/templates/adversarial/` | Strategy execution templates |
| `skills/problem-solving/SKILL.md` | Integrated adversarial quality mode (ps-critic) |
| `docs/governance/JERRY_CONSTITUTION.md` | Constitutional principles |
| ADR-EPIC002-001 | Strategy selection and composite scores |
| ADR-EPIC002-002 | 5-layer enforcement architecture |

---

*Skill Version: 1.0.0*
*Constitutional Compliance: Jerry Constitution v1.0*
*SSOT: `.context/rules/quality-enforcement.md`*
*Created: 2026-02-15*
