adversarial-reviewer
Use this skill for non-trivial code review where the job is to challenge the current artifact set, surface material defects, and produce an explicit change agenda. Trigger for exhaustive review, re-review after fixes, patch hardening, review the current changeset, review the accepted agenda before implementation, or full-scope de novo challenge. Use full-scope review rather than diff-only review when exhaustive confidence matters. Do not trigger for trivial wording feedback or final readiness decisions without a review question.
Best use case
adversarial-reviewer is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Use this skill for non-trivial code review where the job is to challenge the current artifact set, surface material defects, and produce an explicit change agenda. Trigger for exhaustive review, re-review after fixes, patch hardening, review the current changeset, review the accepted agenda before implementation, or full-scope de novo challenge. Use full-scope review rather than diff-only review when exhaustive confidence matters. Do not trigger for trivial wording feedback or final readiness decisions without a review question.
Teams using adversarial-reviewer should expect a more consistent output, faster repeated execution, less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in
.claude/skills/adversarial-reviewer/SKILL.mdinside your project - Restart your AI agent — it will auto-discover the skill
How adversarial-reviewer Compares
| Feature / Agent | adversarial-reviewer | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Use this skill for non-trivial code review where the job is to challenge the current artifact set, surface material defects, and produce an explicit change agenda. Trigger for exhaustive review, re-review after fixes, patch hardening, review the current changeset, review the accepted agenda before implementation, or full-scope de novo challenge. Use full-scope review rather than diff-only review when exhaustive confidence matters. Do not trigger for trivial wording feedback or final readiness decisions without a review question.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
Related Guides
SKILL.md Source
# Adversarial Reviewer This skill is the primary falsifier. It does not implement fixes. It makes the **next required changes explicit**. ## Output modes - **Standard**: full review contract. - **Fast**: minimal CLI surface for action. ## CLI-tail-weighted reporting - Keep dense evidence and ledgers above. - End with **Change Agenda**, **Fixed-Point Judgment**, and **Reviewer Bottom Line**. - In multi-item agendas, order from lower leverage to higher leverage so the strongest next move lands last. ## Core doctrine Operate in **FULL-SCOPE**, **EXHAUSTIVE**, **DE NOVO**, **ADVERSARIAL**, **SATURATING**, **MATERIAL**, **FIXED-POINT**, **PARSIMONIOUS**, **INVARIANT-GRADED**, **HAZARD-SEEKING**, **UNSOUND**, **WITNESS-BEARING**, **PRESERVATION-AWARE**, **PROGRESS-AWARE**, **TOTAL**, and **TRACEABLE** mode. ### Soundness pressure Do not flatten soundness to generic skepticism. Hunt for: - **unwitnessed guarantees** - **ill-typed states** and **illegal inhabitants** - **partial eliminators** - **broken preservation** - **stuck progress** - **incoherent** or **non-compositional** abstractions ### Complexity pressure Judge whether the current state reduces, preserves, or increases incidental complexity. Treat new branching, state, coupling, config surface, public surface, and cognitive burden as first-class review surfaces. ### Remediation posture split Do not let accretive scope discipline suppress material findings. When proposing remediation, prefer the narrowest truthful fix first. Escalate to structural remediation only when a narrower fix is genuinely insufficient. ## Required finding shape Every **material finding** must include: - `finding_id` - `recommended_change` - `evidence_of_defect` - `evidence_of_remedy` - `confidence` - `minimum_acceptable_fix` - `do_not_broaden_into` - `remediation_posture` ## Soundness Ledger For every material soundness issue, include: - `claim_id` - `claim_or_obligation` - `kind` - `witness_required` - `witness_status` - `preservation` - `progress` - `inhabitance` - `evidence` - `minimum_acceptable_fix` ## Output contract ### Standard Use concise sections in this order: - Review Basis - Material Findings - Soundness Ledger - Complexity Delta - Invariant Ledger - Foot-Gun Register - Non-Material Concerns - Verification Gaps - Residual Uncertainty - Change Agenda - Fixed-Point Judgment - Reviewer Bottom Line ### Fast Use concise sections in this order: - Material Findings - Soundness Ledger - Change Agenda - Reviewer Bottom Line ## Change Agenda rules - Make the change request explicit and scannable. - Each agenda item should fit this shape: - `change:` - `why:` - `evidence_of_defect:` - `evidence_of_remedy:` - `minimum_acceptable_fix:` - `posture:` - Put the single highest-value next move last. ## Hard rules - Never bury the recommended change only inside the rationale. - Never call a review complete while a material soundness gap remains unresolved. - Never let incidental style commentary crowd out material findings. - Never recommend structural change without saying why a narrower fix is insufficient. - Never promote a guess to a finding. ## Resources - [grading-rubrics.md](references/grading-rubrics.md) - [example-invocations.md](references/example-invocations.md) - [common-soundness.md](references/common-soundness.md) - [common-ledgers.md](references/common-ledgers.md) - [common-cli-reporting.md](references/common-cli-reporting.md)
Related Skills
zig
Use when implementing, reviewing, migrating, linting, testing, fuzzing, profiling, optimizing, or hardening Zig 0.16.0 code: .zig files, build.zig/build.zig.zon, std.Io/process.Init migration, C interop, expert comptime/metaprogramming/reflection/codegen, allocator ownership, FFI boundaries, concurrency, dependencies, and measured performance work.
verification-closure
Use this skill to decide whether the current artifact set is actually ready by consuming a canonical Closure Handoff Packet, running the narrowest decisive checks, and assigning a grounded readiness state. Trigger for requests like verify this patch is ready, run closure gates, decide if the branch reached a material fixed point, or close the loop on the current artifact state. Do not trigger for broad redesign or de novo code review without a closure question.
ux-audit
Systematic UX evaluation using Nielsen heuristics and accessibility checks. Use when reviewing UI, "is this usable", improving user experience, or pre-launch.
universalist
Use when code smells point to a structural refactor that should ship: flag or state matrices, repeated boundary validation, shared-key agreement checks, branchy policy logic, or syntax mixed with execution. Default to one seam, one smallest honest construction, adapter-first staging, and one proof signal.
synesthesia
Cross-modal diagnostic and review workflow for software systems. Use this skill to understand, explain, compare, critique, debug, profile, review, or refactor code by mapping technical signals into sensory models, then translating those models back into precise engineering language. Best fits include architecture review, readability or maintainability assessment, strange or flaky behavior, performance bottlenecks, API or UX critique, onboarding explanations, and comparing implementations or designs by feel, friction, weight, rhythm, sharpness, smoothness, coupling, or complexity. Also use when prompts ask what a codebase, bug, logs, API, or system feels, sounds, or looks like, or ask to make it lighter, smoother, cleaner, tighter, quieter, or more coherent. Do not use for exact API syntax, compliance or legal interpretation, security sign-off, rote code edits, or terse factual tasks.
st
Manage persistent task plans in `.step/st-plan.jsonl`, with an explicit first-use choice between repo-committed storage and local-only ignore via `.git/info/exclude`, so state survives turns/sessions and can stay reviewable in git when desired. Use when users ask to "use $st", "resume the plan", "export/import plan state", "checkpoint milestones", "track dependencies/blocked work", "show ready next tasks", "keep shared TODO status on disk", "store backlog tasks on disk without loading them into `update_plan` yet", "select which durable tasks enter the mirrored plan", "map a `$select` plan into durable execution state", "prove `$st` works for implementation tracking", mirror the durable plan into Codex `update_plan` or OpenCode `TodoWrite`, or diagnose/repair `st-plan.jsonl` concerns (for example append-only vs mutable semantics, lock-file ignore policy, or seq/checkpoint integrity).
simplify-and-refactor-code-isomorphically
Shrink and unify code without changing behavior. Use when: simplify, refactor, reduce duplication, remove lines, extract helper, reuse component, DRY, collapse, better abstraction.
ship
Finalize work after validation: confirm a signal, capture proof in the PR description, and open a PR (no merge). Use when asked to run `$ship`, ship/finalize a branch, prepare or open a PR without merging, or publish validation proof before handoff.
seq
Mine Codex sessions JSONL (`~/.codex/sessions`) and file-based memories (`~/.codex/memories`) for explicit `$seq` and artifact-forensics questions, preferring `artifact-search`, then specialized follow-ups such as `skill-blocks`, `plan-search`, `session-prompts`, `session-tooling`, and `orchestration-concurrency`, then `query-diagnose`, and generic `query` only when needed. Opencode mining is explicit-only and requires a literal `opencode` cue in the request.
review-adjudication
Discriminately adjudicate PR review comments before implementation. Treat each comment as a claim to test, build the strongest no-change countercase, recover PR rationale with explicit `$seq` when needed, and decide what to act on, rebut, defer, or investigate. Trigger for `$review-adjudication`, review the review, adjudicate PR comments, are these comments relevant, which comments matter, or should we act on these comments. Not for implementing fixes, writing rebuttals only, or final merge closure.
refine
Refine an existing Codex skill in place with minimal diffs, then validate with quick_validate. Trigger when asked to improve a skill's trigger description/frontmatter, workflow text, metadata, scripts/references/assets, or agents/openai.yaml; also for requests to iterate, refactor, rename, or fix a skill using usage/session-mining evidence (for example from $seq).
reduce
Deconstruct high-cost abstractions and recommend lower-level primitives that reduce tooling, indirection, and hidden steps. Use when requests ask for fewer layers (for example "too many layers", "remove this framework/plugin/DI", "ditch codegen/task runners", "replace with plain scripts/config/SQL"), or when prompts say "it feels over-engineered", "start simpler", or "reduce the size of the codebase" (analysis-only unless implementation is explicitly requested).