codex

Run OpenAI's Codex CLI agent in non-interactive mode using `codex exec`. Use when delegating coding tasks to Codex, running Codex in scripts/automation, or when needing a second agent to work on a task in parallel.

Best use case

codex is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using codex should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/codex/SKILL.md --create-dirs "https://raw.githubusercontent.com/sundial-org/skills/main/skills/codex/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/codex/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How codex Compares

| Feature / Agent | codex | Standard Approach |
|-----------------|-------|-------------------|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |


SKILL.md Source

# Codex CLI (Non-Interactive)

Codex is OpenAI's coding agent. Use `codex exec` to run it non-interactively from any CLI agent.

## When to Use Codex

Use Codex when:
- **Parallel work**: Delegate a task while continuing other work
- **Second opinion**: Get an independent implementation or review
- **Long-running tasks**: Offload tasks that may take many iterations
- **Code review**: Use `codex exec review` for PR/diff reviews

Do NOT use Codex for:
- Simple file reads/edits you can do directly
- Tasks requiring back-and-forth conversation
- Tasks needing your current context

## Quick Reference

```bash
# Analysis (read-only, default)
codex exec "describe the architecture of this codebase"

# Allow file edits
codex exec --full-auto "fix the failing tests"

# Code review
codex exec review --uncommitted
codex exec review --base main

# Structured JSON output
codex exec --output-schema schema.json -o result.json "extract metadata"

# Continue previous session (inherits original sandbox settings)
codex exec resume --last "now add tests"
```

## Core Concepts

### Output Streams

Progress goes to stderr, final result to stdout. To capture only the result:
```bash
codex exec "summarize the repo" 2>/dev/null > summary.txt
```

To watch progress while capturing only the result:
```bash
# Progress (stderr) still prints to the terminal; only the result (stdout) is saved
codex exec "generate changelog" | tee output.txt
```
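
When both streams are worth keeping, they can go to separate files. The following is a minimal sketch that relies only on the stdout/stderr split described above; the function name `codex_capture` and its argument order are hypothetical and not part of the Codex CLI.

```bash
# Hypothetical helper: run `codex exec`, saving the final result (stdout)
# and the progress log (stderr) to separate files.
# Usage: codex_capture RESULT_FILE LOG_FILE [codex exec args...]
codex_capture() {
  local result_file="$1" log_file="$2"
  shift 2
  codex exec "$@" >"$result_file" 2>"$log_file"
}
```

For example, `codex_capture summary.txt progress.log "summarize the repo"` leaves the summary in `summary.txt` and the progress output in `progress.log`.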

### Sandbox Modes

In non-interactive mode, **no approval prompts are possible**. Permissions must be set upfront:

| Mode | Flag | Behavior |
|------|------|----------|
| Read-only | (default) | Reads anywhere, writes/commands blocked |
| Workspace-write | `--full-auto` | Pre-approves edits and commands in workspace |
| Full access | `--yolo` | No restrictions. Use in isolated environments only |

**Choose based on task:**
- Analysis/explanation → default (read-only)
- Fix bugs/implement features → `--full-auto`
- Needs network or system access → `--yolo` (dangerous)

Note: `~/.codex/config.toml` can set project trust levels that override defaults.
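
In a script, the choice in the table above could be made explicit with a small lookup. This is only a sketch: the function name `codex_sandbox_flag` and the task categories `analysis`, `edit`, and `system` are hypothetical labels, while the flags themselves are the ones listed in the table.

```bash
# Hypothetical helper: map a task category to the sandbox flag from the table.
# Prints nothing for read-only, since that is the default mode.
codex_sandbox_flag() {
  case "$1" in
    analysis) ;;                      # read-only default, no flag needed
    edit)     echo "--full-auto" ;;   # workspace-write
    system)   echo "--yolo" ;;        # full access; isolated environments only
    *)        echo "unknown task kind: $1" >&2; return 1 ;;
  esac
}
```

For example, `codex exec $(codex_sandbox_flag edit) "fix the failing tests"` expands to the `--full-auto` form. Leave the substitution unquoted so that the read-only case expands to no argument at all.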

### Models

Default model is `gpt-5.2-codex`. Override with `-m`:
```bash
codex exec -m gpt-5 "explain this code"
```

### Authentication

By default, the user should already be authenticated. If not, set `CODEX_API_KEY`:
```bash
CODEX_API_KEY=sk-... codex exec "task"
```

## Code Review

Built-in review subcommand:
```bash
# Review uncommitted changes
codex exec review --uncommitted

# Review against a base branch
codex exec review --base main

# Review a specific commit
codex exec review --commit abc123
```

## Structured Output

Use `--output-schema` for JSON output. **Important**: OpenAI requires `additionalProperties: false` on all object types.

```bash
codex exec --output-schema schema.json -o result.json "extract API endpoints"
```

Schema example:
```json
{
  "type": "object",
  "properties": {
    "name": { "type": "string" },
    "endpoints": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "path": { "type": "string" },
          "method": { "type": "string" }
        },
        "required": ["path", "method"],
        "additionalProperties": false
      }
    }
  },
  "required": ["name", "endpoints"],
  "additionalProperties": false
}
```
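
Because a missing `additionalProperties: false` is easy to overlook in nested objects, a pre-flight check can catch it before the task runs. The sketch below assumes `jq` is installed; the function name `schema_is_strict` is hypothetical.

```bash
# Hypothetical pre-flight check: succeed only if every schema node with
# "type": "object" also sets "additionalProperties": false.
# Assumes jq is installed.
schema_is_strict() {
  jq -e '[.. | objects | select(.type == "object") | .additionalProperties == false] | all' "$1" >/dev/null
}
```

A script could then run `schema_is_strict schema.json || echo "schema needs additionalProperties: false" >&2` before calling `codex exec --output-schema`.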

## Session Resume

Resume continues a previous session, **inheriting its sandbox settings**:
```bash
# Start a task
codex exec --full-auto "implement rate limiter"

# Continue later (inherits --full-auto from original)
codex exec resume --last "add unit tests"

# Or resume by session ID
codex exec resume <SESSION_ID> "follow-up task"
```

**Note**: You cannot pass `--full-auto` to `resume`; the resumed session inherits the sandbox settings of the original session.

## JSONL Event Stream

For programmatic use, `--json` outputs structured events:
```bash
codex exec --json "analyze code" 2>/dev/null | jq -c 'select(.type == "item.completed")'
```
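
If the event stream is saved to a file first, it can be queried more than once without rerunning the task. The sketch below assumes `jq` is installed and relies only on the top-level `type` field used in the filter above; the function name `codex_count_events` and the file name `events.jsonl` are hypothetical.

```bash
# Hypothetical helper: count the events of one type in a saved --json stream.
# Usage: codex_count_events EVENTS_FILE EVENT_TYPE
codex_count_events() {
  local events_file="$1" event_type="$2"
  # -s slurps the JSONL lines into one array so the matches can be counted
  jq -s --arg t "$event_type" '[.[] | select(.type == $t)] | length' "$events_file"
}
```

For example, after `codex exec --json "analyze code" 2>/dev/null > events.jsonl`, running `codex_count_events events.jsonl item.completed` prints the number of completed items.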

## Performance & Best Practices

### Execution Time
Complex tasks typically take **60-120+ seconds**. Simple analysis tasks complete in 10-30 seconds.

- Tasks may continue executing even after your timeout
- Always check if files were modified regardless of timeout status
- Use `tail -f <output_file>` to monitor long-running background tasks
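
The points above can be sketched as a pair of helpers: one starts the task in the background with its output in a log file, the other reports on it after a timeout. The names `codex_background`, `codex_check`, and `CODEX_PID` are hypothetical, and the check assumes the workspace is a git repository.

```bash
# Hypothetical helper: start `codex exec` in the background, sending both
# streams to a log file that can be followed with `tail -f`.
# Sets CODEX_PID so the caller can poll or wait on the job.
# Usage: codex_background LOG_FILE [codex exec args...]
codex_background() {
  local log_file="$1"
  shift
  codex exec "$@" >"$log_file" 2>&1 &
  CODEX_PID=$!
}

# Hypothetical helper: report whether the job is still running, then list
# modified files regardless, since work may have landed before a timeout.
# Assumes the workspace is a git repository.
codex_check() {
  if kill -0 "$CODEX_PID" 2>/dev/null; then
    echo "still running (pid $CODEX_PID)"
  else
    echo "finished"
  fi
  git status --porcelain
}
```

A typical sequence would be `codex_background run.log --full-auto "implement rate limiter"`, then `tail -f run.log` to watch, and `codex_check` once your own timeout has passed.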

### Task Granularity
Break complex work into focused tasks:

```bash
# Good: Focused, single-purpose tasks
codex exec --full-auto "add star ratings to the skill cards"
codex exec --full-auto "add a search filter to the toolbar"

# Avoid: Multi-feature requests in one task
codex exec --full-auto "add ratings, search, filters, modal, and animations"
```
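
A list of focused tasks can also be driven from a loop. This is a minimal sketch that assumes `codex exec` exits with a non-zero status when a task fails, which the text above does not confirm; the function name `codex_run_tasks` is hypothetical.

```bash
# Hypothetical helper: run each prompt as its own `codex exec --full-auto`
# task, one after another, stopping at the first failure.
# Assumes `codex exec` exits non-zero when a task fails.
codex_run_tasks() {
  local prompt
  for prompt in "$@"; do
    echo "==> $prompt" >&2
    codex exec --full-auto "$prompt" || return 1
  done
}
```

For example, `codex_run_tasks "add star ratings to the skill cards" "add a search filter to the toolbar"` runs the two focused tasks in order. Running them sequentially also keeps two sessions from editing the same files at once.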

### Concurrent Editing
**Avoid running multiple Codex sessions on the same file simultaneously.** While it may work, concurrent edits risk merge conflicts or overwrites.
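
One possible guard, shown here only as a sketch, is a lock directory that a wrapper creates before starting a session and removes afterwards. The function name `with_codex_lock`, the `CODEX_LOCK_DIR` variable, and the `.codex-lock` default are all hypothetical; nothing here is built into the Codex CLI.

```bash
# Hypothetical guard: run a command only if no other wrapped session holds
# the lock. `mkdir` is atomic, so only one caller can create the directory.
# Usage: with_codex_lock codex exec --full-auto "task"
with_codex_lock() {
  local lock_dir="${CODEX_LOCK_DIR:-.codex-lock}"
  if ! mkdir "$lock_dir" 2>/dev/null; then
    echo "another Codex session holds $lock_dir" >&2
    return 1
  fi
  "$@"
  local rc=$?            # keep the wrapped command's exit status
  rmdir "$lock_dir"
  return "$rc"
}
```

The lock only protects sessions started through the wrapper, and a session killed mid-run leaves the directory behind, so it would need to be removed by hand.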

### Large Files
Files over ~2000 lines slow execution as Codex reads the entire file multiple times. Consider:
- Splitting into multiple files when possible
- Using specific line references in prompts
- Breaking incremental changes into smaller tasks

## Error Handling

Common errors:
- **Timeout**: Long tasks may time out. Check whether the work completed anyway, or use `resume` to continue.
- **Sandbox blocked**: The task needs to write files but is running in read-only mode. Use `--full-auto`.
- **Schema validation**: A schema object is missing `additionalProperties: false`.
- **Model not supported**: Some models are unavailable with ChatGPT auth. Use the default `gpt-5.2-codex`.

## Detailed References

- **[exec-reference.md](references/exec-reference.md)**: Complete flag reference
- **[prompting.md](references/prompting.md)**: Effective prompts and workflow patterns

## Project Context

Codex reads `AGENTS.md` files for project instructions:
- `~/.codex/AGENTS.md` - Global defaults
- `<repo>/AGENTS.md` - Project-specific

```markdown
# AGENTS.md
- Run `npm test` after modifying JS files
- Use pnpm for dependencies
```
