codex-review
Two-pass adversarial review of design documents and implementation plans using OpenAI Codex CLI. Invokes Codex to review plans section-by-section (pass 1), then holistically (pass 2), feeding critique back for revision. Use when you have a design doc, architecture plan, or implementation plan that should be stress-tested before execution.
Best use case
codex-review is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Two-pass adversarial review of design documents and implementation plans using OpenAI Codex CLI. Invokes Codex to review plans section-by-section (pass 1), then holistically (pass 2), feeding critique back for revision. Use when you have a design doc, architecture plan, or implementation plan that should be stress-tested before execution.
Teams using codex-review should expect a more consistent output, faster repeated execution, less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in
.claude/skills/codex-review/SKILL.mdinside your project - Restart your AI agent — it will auto-discover the skill
How codex-review Compares
| Feature / Agent | codex-review | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Two-pass adversarial review of design documents and implementation plans using OpenAI Codex CLI. Invokes Codex to review plans section-by-section (pass 1), then holistically (pass 2), feeding critique back for revision. Use when you have a design doc, architecture plan, or implementation plan that should be stress-tested before execution.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
Related Guides
Cursor vs Codex for AI Workflows
Compare Cursor and Codex for AI coding workflows, repository assistance, debugging, refactoring, and reusable developer skills.
AI Agents for Coding
Browse AI agent skills for coding, debugging, testing, refactoring, code review, and developer workflows across Claude, Cursor, and Codex.
Best AI Skills for Claude
Explore the best AI skills for Claude and Claude Code across coding, research, workflow automation, documentation, and agent operations.
SKILL.md Source
# Codex Adversarial Review Use this skill to get an independent adversarial review of design documents and implementation plans by invoking OpenAI Codex CLI as a reviewer. The review happens in two passes: 1. **Pass 1 — Section Review**: Each major section is reviewed independently for local issues (correctness, completeness, feasibility, internal consistency) 2. **Pass 2 — Holistic Review**: The full document is reviewed for cross-cutting concerns (contradictions between sections, missing integration points, architectural gaps, unaddressed failure modes) After each pass, Claude integrates the feedback and revises the plan before proceeding. ## Prerequisites - Python 3.10+ (no extra packages required — stdlib only) - `codex` CLI installed and authenticated (`npm i -g @openai/codex`) - Authentication configured (ChatGPT auth or `OPENAI_API_KEY` / `CODEX_API_KEY` set) - The plan/design document exists as a file (markdown preferred) All scripts are Python for cross-platform compatibility (Windows, macOS, Linux). ## When to Use - After drafting a design document or implementation plan - Before executing a plan with Claude Code's superpowers/execute-plan - When you want a second opinion on architecture decisions - When a plan is complex enough that internal inconsistencies are likely ## Workflow ### Step 0: Preparation Before invoking any review, ensure: 1. The plan document exists as a file. If the plan is only in conversation context, write it to a file first. 2. Identify the document path — all scripts expect this as input. 3. Check that `codex` is available: `which codex` ### Step 1: Section Extraction Split the document into reviewable sections. Use the `scripts/extract-sections.sh` script: ```bash python3 /path/to/codex-review/scripts/extract_sections.py <plan-file> ``` This creates a `.codex-review/sections/` directory with individual section files. Each file contains the section content plus minimal surrounding context (the document title and table of contents if present) so Codex can understand where the section fits. ### Step 2: Pass 1 — Section Reviews For each section, invoke Codex with the section review prompt: ```bash python3 /path/to/codex-review/scripts/review_section.py <section-file> [review-type] ``` Where `review-type` is one of: - `architecture` (default) — focuses on structural soundness, component boundaries, data flow - `implementation` — focuses on feasibility, ordering, dependencies, edge cases - `api` — focuses on interface contracts, backwards compatibility, error handling - `data` — focuses on data models, migrations, consistency, performance Each section review produces a structured feedback file in `.codex-review/feedback/pass1/`. **Review the pass 1 feedback with the user.** Present a summary of findings per section, categorized by severity: - 🔴 **Critical** — Blocking issues, correctness problems, missing requirements - 🟡 **Warning** — Gaps, potential issues, unclear specifications - 🔵 **Suggestion** — Improvements, alternatives worth considering **Revise the plan** based on pass 1 feedback before proceeding to pass 2. This is important — pass 2 should review the *improved* plan, not the original. ### Step 3: Pass 2 — Holistic Review After revisions, invoke the holistic review on the full (revised) document: ```bash python3 /path/to/codex-review/scripts/review_holistic.py <plan-file> <pass1-feedback-dir> ``` This pass specifically looks for: - Contradictions between sections - Missing integration points or handoff gaps - Unaddressed failure modes and error propagation - Implicit assumptions that aren't documented - Ordering and dependency issues across sections - Security and operational concerns The holistic review also receives the pass 1 feedback so it can verify that earlier issues were actually addressed. Output goes to `.codex-review/feedback/pass2/holistic-review.md`. ### Step 4: Final Integration Present the pass 2 findings to the user. Apply final revisions. The complete review trail is preserved in `.codex-review/` for reference. ## Configuration The skill respects these environment variables: | Variable | Default | Description | |----------|---------|-------------| | `CODEX_REVIEW_MODEL` | (codex default) | Override the Codex model for reviews | | `CODEX_REVIEW_TIMEOUT` | `120` | Timeout in seconds per review invocation | | `CODEX_REVIEW_VERBOSE` | `0` | Set to `1` to show Codex stderr output | ## Tips - For very large documents (>3000 words), section review is essential — Codex gives better feedback on focused chunks - The `implementation` review type is best for plans that will be fed to execute-plan - You can re-run just pass 2 after manual edits without redoing pass 1 - If Codex flags something you disagree with, note it as `[ACKNOWLEDGED]` — the holistic pass will see this and won't re-flag it - Keep the `.codex-review/` directory around — it's useful for understanding why decisions were made later ## Iterating on Subsections If pass 1 reveals major issues in a specific section, use the iteration script to do focused multi-round review using Codex session resume: ```bash python3 /path/to/codex-review/scripts/iterate_section.py <section-file> <revised-section-file> [review-type] [max-rounds] ``` **How it works:** 1. Round 1: Codex reviews the original section (creates a session) 2. Claude (or you) revises the section based on feedback 3. Round 2+: The script resumes the **same Codex session** via `codex exec resume --last`, passing the revised content. Codex evaluates whether its previous concerns were addressed, marks findings as RESOLVED or UNRESOLVED, and flags any new issues introduced by the revision. 4. Repeats until Codex responds with "✅ SECTION APPROVED" or `max-rounds` (default 3) is reached. **Interactive mode (default):** The script pauses between rounds and waits for ENTER after editing the revised file. Press `q` to stop early. **Non-interactive mode (`--no-interactive`):** Runs all rounds without pausing. Useful when Claude Code manages the edit-review loop externally. **Single-round mode (`--round N`):** Runs only round N. This is the best option when Claude Code drives the loop — it can run round 1, read feedback, revise the file, then run round 2, etc. Example Claude Code workflow: ```bash # Round 1 python3 scripts/iterate_section.py sections/03-data-model.md revised.md data --round 1 # Claude reads feedback, revises revised.md # Round 2 python3 scripts/iterate_section.py sections/03-data-model.md revised.md data --round 2 ``` **Convergence detection:** The script tracks issue counts across rounds. If issues stop decreasing after 3+ rounds, it warns that manual review may be needed. **Fallback:** If `codex exec resume --last` fails (e.g., session expired), the script falls back to a fresh `codex exec` with the revision prompt. This loses conversational context but still gets a review. All iteration feedback is preserved in `.codex-review/feedback/iterations/<section-name>/` with a summary file. ### Recommended Workflow For plans with mixed quality across sections: 1. Run pass 1 on all sections 2. Triage: identify sections with 🔴 critical findings 3. Use `iterate-section.sh` on critical sections until approved 4. Re-run pass 1 on remaining 🟡 warning sections if needed 5. Only then proceed to pass 2 holistic review This avoids burning a holistic review on a document with known local problems.
Related Skills
julien-workflow-advice-codex
Get OpenAI Codex CLI's opinion on code, bugs, or implementation. Use when you want a second AI perspective during coding sessions.
jetbrains-marketplace-reviews
Fetch and visualize reviews for any JetBrains Marketplace plugin. Use when (1) analyzing plugin review trends, (2) getting review statistics for a time period, (3) visualizing rating distributions, (4) monitoring user feedback. Triggers on requests like "get JetBrains reviews", "copilot plugin feedback", "JetBrains marketplace reviews", "visualize plugin ratings", "analyze JetBrains plugin reviews".
ethics-reviewer
This skill should be used when the user mentions "dark patterns", "accessibility", "a11y", "privacy", "tracking", "analytics", "notifications", "user data", "GDPR", "consent", "manipulation", "sustainability", "performance budget", or when building user-facing features that collect data, send notifications, display urgency, or gate access. Addresses ethical constraints in software design — manipulation, accessibility, privacy, and sustainability.
error-debugging-multi-agent-review
Use when working with error debugging multi agent review
datahub-connector-pr-review
This skill should be used when the user asks to "review my connector", "check my datahub connector", "review connector code", "audit connector", "review PR", "check code quality", or any request to review/check/audit a DataHub ingestion source. Covers compliance with standards, best practices, testing quality, and merge readiness.
cursor-rules-review
Audit Cursor IDE rules (.mdc files) against quality standards using a 5-gate review process. Validates frontmatter (YAML syntax, required fields, description quality, triggering configuration), glob patterns (specificity, performance, correctness), content quality (focus, organization, examples, cross-references), file length (under 500 lines recommended), and functionality (triggering, cross-references, maintainability). Use when reviewing pull requests with Cursor rule changes, conducting periodic rule quality audits, validating new rules before committing, identifying improvement opportunities, preparing rules for team sharing, or debugging why rules aren't working as expected.
cpm:review
Adversarial review of epic docs and stories. Agents from the party roster examine planning artifacts through their professional lens, challenging assumptions, spotting gaps, and flagging risks. Triggers on "/cpm:review".
contract-review-pro
专业合同审核 Skill,基于《合同审核方法论体系》提供合同类型指引和详细审核服务
consult-codex
Compare OpenAI Codex GPT-5.2 and code-searcher responses for comprehensive dual-AI code analysis. Use when you need multiple AI perspectives on code questions.
codex-team
Use when you have 2+ tasks that Codex agents should execute. Runtime-native: Codex sub-agents when available, Codex CLI fallback otherwise. Handles file conflicts via merge/wave strategies. Triggers: "codex team", "spawn codex", "codex agents", "use codex for", "codex fix".
codex
Run OpenAI's Codex CLI agent in non-interactive mode using `codex exec`. Use when delegating coding tasks to Codex, running Codex in scripts/automation, or when needing a second agent to work on a task in parallel.
codex-sessions-skill-scan
Daily skill health scan: analyze ~/.codex/sessions plus per-repo session logs under ~/dev (default last 1 day) and summarize skill invocations + likely failures for personal skills in ~/dev/agent-skills (missing paths, tool failures, complex-task word triggers). Optional: include best-effort local OTel signals.