Codex

execute-feedback

Execute tests on generated code and iterate until passing

104 stars

by jmagly

Best use case

execute-feedback is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

It is a strong fit for teams already working in Codex.

Teams using execute-feedback should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/execute-feedback/SKILL.md --create-dirs "https://raw.githubusercontent.com/jmagly/aiwg/main/.agents/skills/execute-feedback/SKILL.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/execute-feedback/SKILL.md inside your project
Restart your AI agent; it will auto-discover the skill

How execute-feedback Compares

Feature / Agent	execute-feedback	Standard Approach
Platform Support	Codex	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

What does this skill do?

It runs tests on generated code, analyzes any failures, applies fixes, and retries until the tests pass or the attempt limit (default: 3) is reached.

Which AI agents support this skill?

This skill is designed for Codex.

Where can I find the source code?

The source code is in the jmagly/aiwg repository on GitHub, at .agents/skills/execute-feedback/SKILL.md.

SKILL.md Source

# Execute Feedback Command

Run executable feedback loop on generated code: execute tests, analyze failures, fix, and retry.

## Instructions

When invoked, perform the executable feedback loop per REF-013 MetaGPT:

1. **Identify Target**
   - Load the specified file or recently modified code files
   - Determine test framework (jest, pytest, cargo test, go test, etc.)
   - Find existing tests or generate test stubs if none exist

2. **Execute Tests**
   - Run the specified test command (or auto-detect)
   - Capture full output (stdout, stderr, exit code)
   - Parse test results: passed, failed, errors, skipped

3. **Analyze Failures**
   - For each failing test:
     - Extract error type and message
     - Identify root cause (null check, type error, logic error, etc.)
     - Map to source code location
   - Check debug memory for similar past failures

4. **Apply Fixes**
   - Generate targeted fix based on root cause analysis
   - Apply fix to source code
   - Increment attempt counter

5. **Re-Execute**
   - Run tests again after fix
   - Compare results to previous attempt
   - If all pass: record success in debug memory, return
   - If still failing and attempts remain: repeat from step 3

6. **Escalate if Needed**
   - After max attempts (default: 3), escalate to human
   - Include: all test results, failure analyses, fix attempts
   - Save debug memory session

7. **Update Debug Memory**
   - Record execution session in `.aiwg/ralph/debug-memory/sessions/`
   - Extract learned patterns to `.aiwg/ralph/debug-memory/patterns/`
   - Update success metrics
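
The loop in steps 2 through 6 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the skill's actual implementation: `RunResult`, `LoopResult`, `run_tests`, and `feedback_loop` are hypothetical names, and the failure analysis and fix generation (steps 3 and 4) are reduced to a caller-supplied `fix` callback because the source does not specify how they are performed.

```python
import subprocess
from dataclasses import dataclass, field
from typing import Callable, List, Sequence


@dataclass
class RunResult:
    """Captured output of one test execution (step 2). Hypothetical type."""
    exit_code: int
    stdout: str
    stderr: str

    @property
    def passed(self) -> bool:
        # Assumption: the test command exits with 0 only when all tests pass.
        return self.exit_code == 0


@dataclass
class LoopResult:
    """Outcome of the loop, keeping every run so it can be reported on escalation."""
    success: bool
    attempts: int
    runs: List[RunResult] = field(default_factory=list)
    escalated: bool = False


def run_tests(command: Sequence[str]) -> RunResult:
    """Run the test command and capture stdout, stderr, and the exit code."""
    proc = subprocess.run(command, capture_output=True, text=True)
    return RunResult(proc.returncode, proc.stdout, proc.stderr)


def feedback_loop(
    run: Callable[[], RunResult],
    fix: Callable[[RunResult], None],
    max_attempts: int = 3,
) -> LoopResult:
    """Execute, fix, and re-execute until passing or out of attempts."""
    runs = [run()]  # step 2: initial execution
    attempts = 0
    while not runs[-1].passed and attempts < max_attempts:
        fix(runs[-1])       # steps 3-4: analyze the failure and apply a fix
        attempts += 1       # step 4: increment attempt counter
        runs.append(run())  # step 5: re-execute
    success = runs[-1].passed
    # Step 6: if the tests still fail after max_attempts, flag for a human.
    return LoopResult(success, attempts, runs, escalated=not success)
```

In this sketch, passing `max_attempts=0` runs the tests once and reports the result without calling `fix`, which is one way the `--no-fix` behavior could be modeled.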

## Arguments

- `[file-path]` - Source file to test (default: recently modified files)
- `--test-command [cmd]` - Test command to run (default: auto-detect)
- `--max-attempts [n]` - Maximum fix attempts (default: 3)
- `--coverage [%]` - Minimum coverage target (default: 80)
- `--no-fix` - Run tests only, report without fixing
- `--verbose` - Show full test output
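
As an illustration of how these arguments and their defaults fit together, here is a minimal `argparse` sketch. The flag names and defaults come from the list above; everything else is an assumption made for the example, including the function names `build_parser` and `detect_test_command` and the marker-file mapping used for auto-detection. The skill's real detection logic is not described in the source.

```python
import argparse
from pathlib import Path
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    """Mirror the documented arguments and defaults (hypothetical CLI wrapper)."""
    parser = argparse.ArgumentParser(prog="execute-feedback")
    parser.add_argument("file_path", nargs="?", default=None,
                        help="Source file to test (default: recently modified files)")
    parser.add_argument("--test-command", default=None,
                        help="Test command to run (default: auto-detect)")
    parser.add_argument("--max-attempts", type=int, default=3,
                        help="Maximum fix attempts")
    parser.add_argument("--coverage", type=float, default=80.0,
                        help="Minimum coverage target, in percent")
    parser.add_argument("--no-fix", action="store_true",
                        help="Run tests only, report without fixing")
    parser.add_argument("--verbose", action="store_true",
                        help="Show full test output")
    return parser


# Assumed mapping from a project marker file to a test command.
# This is one plausible auto-detect heuristic, not the skill's documented one.
MARKERS = [
    ("Cargo.toml", ["cargo", "test"]),
    ("go.mod", ["go", "test", "./..."]),
    ("package.json", ["npx", "jest"]),
    ("pyproject.toml", ["pytest"]),
]


def detect_test_command(root: Path) -> Optional[List[str]]:
    """Return the command for the first marker file found in root, else None."""
    for marker, command in MARKERS:
        if (root / marker).exists():
            return list(command)
    return None
```

A caller would fall back to `detect_test_command(Path.cwd())` only when `--test-command` is not supplied, so an explicit command always takes precedence over the heuristic.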

## References

- @$AIWG_ROOT/agentic/code/frameworks/sdlc-complete/rules/executable-feedback.md - Executable feedback rules
- @$AIWG_ROOT/agentic/code/addons/ralph/docs/executable-feedback-guide.md - Implementation guide
- @$AIWG_ROOT/agentic/code/addons/ralph/schemas/debug-memory.yaml - Debug memory schema
- @$AIWG_ROOT/agentic/code/frameworks/sdlc-complete/schemas/flows/executable-feedback.yaml - Workflow schema
- @.aiwg/research/findings/REF-013-metagpt.md - Research foundation
