agentv-prompt-optimizer
Iteratively optimize prompt files against AgentV evaluation datasets by analyzing failures and refining instructions.
Best use case
agentv-prompt-optimizer is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Teams using agentv-prompt-optimizer can expect more consistent output, faster repeated runs, and less manual prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in `.claude/skills/agentv-prompt-optimizer/SKILL.md` inside your project
- Restart your AI agent; it will auto-discover the skill
How agentv-prompt-optimizer Compares
| Feature / Agent | agentv-prompt-optimizer | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Iteratively optimize prompt files against AgentV evaluation datasets by analyzing failures and refining instructions.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# AgentV Prompt Optimizer
## Input Variables
- `eval-path`: Path or glob pattern to the AgentV evaluation file(s) to optimize against
- `optimization-log-path` (optional): Path where optimization progress should be logged
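A minimal sketch of how the `eval-path` input might be resolved, given that it can be either a file path or a glob pattern. The helper name `resolve_eval_files` is hypothetical and not part of AgentV, and the `.yaml` extension in the usage note is only an illustration.

```python
from glob import glob
from pathlib import Path


def resolve_eval_files(eval_path: str) -> list[Path]:
    """Expand a file path or glob pattern into a sorted list of existing files.

    Hypothetical helper: AgentV itself may resolve `eval-path` differently.
    """
    # glob() returns a literal path unchanged when it exists, so a single call
    # covers both the "path" and the "glob pattern" forms of `eval-path`.
    matches = glob(eval_path, recursive=True)  # recursive=True lets `**` span directories
    files = sorted(Path(m) for m in matches if Path(m).is_file())
    if not files:
        raise FileNotFoundError(f"no evaluation files match {eval_path!r}")
    return files
```

For example, `resolve_eval_files("evals/**/*.yaml")` would return every matching file under `evals/`, and raises `FileNotFoundError` when nothing matches so a mistyped path fails early.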
## Workflow
1. **Initialize**
   - Verify `<eval-path>` (file or glob) targets the correct system.
   - **Identify Prompt Files**:
     - Infer prompt files from the eval file content (look for `file:` references in `input` that match these patterns).
     - Recursively check referenced prompt files for *other* prompt references (dependencies).
     - If multiple prompts are found, consider ALL of them as candidates for optimization.
   - **Identify Optimization Log**:
     - If `<optimization-log-path>` is provided, use it.
     - If not, create a new one in the parent directory of the eval files: `optimization-[timestamp].md`.
   - Read the content of each identified prompt file.
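The Initialize step above can be sketched roughly as follows. This is an illustration under stated assumptions, not AgentV's own logic: the function names are hypothetical, a reference is assumed to look like `file: some/path.md` in plain text, referenced paths are assumed to be relative to the file that mentions them, and the timestamp format is an arbitrary choice because the source does not specify one.

```python
import re
from datetime import datetime
from pathlib import Path
from typing import Optional

# Assumption: a reference looks like `file: some/path.md`. The real AgentV
# eval schema may differ. `\b` avoids matching keys such as `profile:`.
FILE_REF = re.compile(r"""\bfile:\s*["']?([^\s"']+)""")


def find_prompt_files(start: Path, seen: Optional[set[Path]] = None) -> set[Path]:
    """Collect files referenced via `file:` from `start`, following references recursively."""
    seen = set() if seen is None else seen
    for ref in FILE_REF.findall(start.read_text(encoding="utf-8")):
        # Assumption: references are relative to the referencing file's directory.
        target = (start.parent / ref).resolve()
        # Skip missing files and anything already visited (guards against cycles).
        if target.is_file() and target not in seen:
            seen.add(target)
            find_prompt_files(target, seen)  # pick up dependencies of this prompt
    return seen


def default_log_path(eval_file: Path, now: datetime) -> Path:
    """Place `optimization-[timestamp].md` in the eval file's parent directory."""
    # Assumption: the timestamp format below is illustrative only.
    return eval_file.parent / f"optimization-{now:%Y%m%d-%H%M%S}.md"
```

Every path returned by `find_prompt_files` is a candidate for optimization, which matches the instruction to consider all discovered prompts rather than only the first one.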
2. **Optimization Loop** (Max 10 iterations)
   - **Execute (The Generator)**: Run `agentv eval <eval-path>`.
     - *Targeted Run*: If iterating on specific stubborn failures, use `--test-id <test_id>` to run only the relevant tests.
   - **Analyze (The Reflector)**:
     - Locate the results file path from the console output (e.g., `.agentv/results/eval_...jsonl`).
     - **Orchestrate Subagent**: Use `runSubagent` to analyze the results.
       - **Task**: Read the results file, calculate pass rate, and perform root cause analysis.
       - **Output**: Return a structured analysis including:
         - **Score**: Current pass rate.
         - **Root Cause**: Why failures occurred (e.g., "Ambiguous definition", "Hallucination").
         - **Insight**: Key learning or pattern identified from the failures.
         - **Strategy**: High-level plan to fix the prompt (e.g., "Clarify section X", "Add negative constraint").
   - **Decide**:
     - If **100% pass**: STOP and report success.
     - If **Score decreased**: Revert the last change and try a different approach.
     - If **No improvement** (2x): STOP and report stagnation.
   - **Refine (The Curator)**:
     - **Orchestrate Subagent**: Use `runSubagent` to apply the fix.
       - **Task**: Read the relevant prompt file(s), apply the **Strategy** from the Reflector, and generate the log entry.
       - **Output**: The **Log Entry** describing the specific operation performed.
         ```markdown
         ### Iteration [N]
         - **Operation**: [ADD / UPDATE / DELETE]
         - **Target**: [Section Name]
         - **Change**: [Specific text added/modified]
         - **Trigger**: [Specific failing test case or error pattern]
         - **Rationale**: [From Reflector: Root Cause]
         - **Score**: [From Reflector: Current Pass Rate]
         - **Insight**: [From Reflector: Key Learning]
         ```
     - **Strategy**: Treat the prompt as a structured set of rules. Execute atomic operations:
       - **ADD**: Insert a new rule if a constraint was missed.
       - **UPDATE**: Refine an existing rule to be clearer or more general.
         - *Clarify*: Make ambiguous instructions specific.
         - *Generalize*: Refactor specific fixes into high-level principles (First Principles).
       - **DELETE**: Remove obsolete, redundant, or harmful rules.
         - *Prune*: If a general rule covers specific cases, delete the specific ones.
       - **Negative Constraint**: If hallucinating, explicitly state what NOT to do. Prefer generalized prohibitions over specific forbidden tokens where possible.
       - **Safety Check**: Ensure new rules don't contradict existing ones (unless intended).
     - **Constraint**: Avoid rewriting large sections. Make surgical, additive changes to preserve existing behavior.
   - **Log Result**:
     - Append the **Log Entry** returned by the Curator to the optimization log file.
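The Execute, Analyze, and Decide steps above can be sketched as three small functions. This is a hedged illustration, not AgentV's implementation. Only `agentv eval <eval-path>` and `--test-id <test_id>` come from the workflow; the function names and returned action labels are hypothetical. The results schema is not described here, so the pass criterion is supplied by the caller. "No improvement (2x)" is interpreted as two consecutive runs that each fail to beat the run before them.

```python
import json
from typing import Callable, Iterable, Optional


def build_eval_command(eval_path: str, test_id: Optional[str] = None) -> list[str]:
    """Build the `agentv eval` invocation, optionally narrowed to one test."""
    command = ["agentv", "eval", eval_path]
    if test_id is not None:
        command += ["--test-id", test_id]  # targeted run for a stubborn failure
    return command


def pass_rate(jsonl_lines: Iterable[str], passed: Callable[[dict], bool]) -> float:
    """Fraction of result records for which `passed(record)` is true."""
    # Blank lines are skipped; every other line must be one JSON object.
    records = [json.loads(line) for line in jsonl_lines if line.strip()]
    if not records:
        raise ValueError("results file contains no records")
    return sum(1 for record in records if passed(record)) / len(records)


def decide(history: list[float], max_iterations: int = 10) -> str:
    """Choose the next action from the pass rates seen so far (oldest first)."""
    current = history[-1]
    if current == 1.0:
        return "stop_success"
    if len(history) >= 2 and current < history[-2]:
        return "revert"  # score decreased: undo the last change, try another approach
    # Assumption: stagnation means the last two runs each failed to improve.
    if len(history) >= 3 and history[-1] <= history[-2] and history[-2] <= history[-3]:
        return "stop_stagnation"
    if len(history) >= max_iterations:
        return "stop_limit"
    return "refine"
```

A caller would pass the command to something like `subprocess.run(command, check=True)`, read the results file reported on the console, and compute the score with a predicate that fits the actual record format, for example `lambda r: r["score"] >= 0.8` if the records carried a numeric `score` field (a hypothetical field name).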
3. **Completion**
   - Report final score.
   - Summarize key changes made to the prompt.
   - **Finalize Optimization Log**: Add a summary header to the optimization log file indicating the session completion and final score.
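The logging side of the workflow (appending one entry per iteration, then adding a summary header at completion) might look like the sketch below. The entry fields mirror the Log Entry template above. The class and function names and the exact wording of the summary header are assumptions made for illustration.

```python
from dataclasses import dataclass
from pathlib import Path

VALID_OPERATIONS = {"ADD", "UPDATE", "DELETE"}


@dataclass
class LogEntry:
    iteration: int
    operation: str  # one of ADD / UPDATE / DELETE
    target: str
    change: str
    trigger: str
    rationale: str
    score: str
    insight: str

    def __post_init__(self) -> None:
        if self.operation not in VALID_OPERATIONS:
            raise ValueError(f"unknown operation: {self.operation!r}")

    def to_markdown(self) -> str:
        """Render the entry in the same shape as the Log Entry template."""
        return (
            f"### Iteration {self.iteration}\n"
            f"- **Operation**: {self.operation}\n"
            f"- **Target**: {self.target}\n"
            f"- **Change**: {self.change}\n"
            f"- **Trigger**: {self.trigger}\n"
            f"- **Rationale**: {self.rationale}\n"
            f"- **Score**: {self.score}\n"
            f"- **Insight**: {self.insight}\n"
        )


def append_entry(log_path: Path, entry: LogEntry) -> None:
    """Append one iteration's entry, followed by a blank line, to the log."""
    with log_path.open("a", encoding="utf-8") as log:
        log.write(entry.to_markdown() + "\n")


def finalize_log(log_path: Path, final_score: str, iterations: int) -> None:
    """Prepend a summary header (layout is an assumption) to the existing log."""
    body = log_path.read_text(encoding="utf-8") if log_path.exists() else ""
    header = (
        "# Optimization Summary\n"
        "- **Status**: Complete\n"
        f"- **Final Score**: {final_score}\n"
        f"- **Iterations**: {iterations}\n\n"
    )
    log_path.write_text(header + body, encoding="utf-8")
```

Rejecting unknown operations in `__post_init__` keeps the log aligned with the three atomic operations the Curator is allowed to perform.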
## Guidelines
- **Generalization First**: Prefer broad, principle-based guidelines over specific examples or "hotfixes". Only use specific rules if generalized instructions fail to achieve the desired score.
- **Simplicity ("Less is More")**: Avoid overfitting to the test set. If a specific rule doesn't significantly improve the score compared to a general one, choose the general one.
- **Structure**: Maintain existing Markdown headers/sections.
- **Progressive Disclosure**: If the prompt grows too large (>200 lines), consider moving specialized logic into a separate file or skill.
- **Quality Criteria**: Ensure the prompt defines a clear persona, specific task, and measurable success criteria.
Related Skills
create-prompt
Expert prompt engineering for creating effective prompts for Claude, GPT, and other LLMs. Use when writing system prompts, user prompts, few-shot examples, or optimizing existing prompts for better performance.
create-custom-prompt
Prompt for creating custom prompt files
context-optimizer
Analyzes Copilot Chat debug logs, agent definitions, skills, and instruction files to audit context window utilization. Provides log parsing, turn-cost profiling, redundancy detection, hand-off gap analysis, and optimization recommendations. Use when optimizing agent context efficiency, identifying where to add subagent hand-offs, or reducing token waste across agent systems.
image-optimizer
Optimize and compress images for web use. Reduces file sizes of JPEG, PNG, GIF images using lossy/lossless compression. Can resize images to maximum dimensions, convert to WebP format, and process entire directories recursively. Use when images are too large for web, need compression, or need format conversion.
article-title-optimizer
This skill analyzes article content in-depth and generates optimized, marketable titles in the format 'Title: Subtitle' (10-12 words maximum). The skill should be used when users request title optimization, title generation, or title improvement for articles, blog posts, or written content. It generates 5 title candidates using proven formulas, evaluates them against success criteria (clickability, SEO, clarity, emotional impact, memorability, shareability), and replaces the article's title with the winning candidate.
prompt-engineer
Transforms user prompts into optimized prompts using frameworks (RTF, RISEN, Chain of Thought, RODES, Chain of Density, RACE, RISE, STAR, SOAP, CLEAR, GROW)
seo-meta-optimizer
Creates optimized meta titles, descriptions, and URL suggestions based on character limits and best practices. Generates compelling, keyword-rich metadata. Use PROACTIVELY for new content.
internal-linking-optimizer
Use when the user asks to "fix internal links", "improve site architecture", "link structure", "distribute page authority", "internal linking strategy", "orphan pages", "site architecture is messy", or "pages have no links pointing to them". Analyzes and optimizes internal link structure to improve site architecture, distribute page authority, and help search engines understand content relationships. Creates strategic internal linking plans. For a broader on-page audit, see on-page-seo-auditor. For external link analysis, see backlink-analyzer.
python-fastapi-scalable-api-cursorrules-prompt-fil
Defines conventions specific to FastAPI usage in the backend. Globs: `backend/src/**/*.py`
python-django-best-practices-cursorrules-prompt-fi
Configurations for Django settings file with the list of dependencies and conventions. Globs: `**/settings.py`
performance-optimizer
Performance analysis, profiling techniques, bottleneck identification, and optimization strategies for code and systems. Use when the user needs to improve performance, reduce resource usage, or identify and fix performance bottlenecks.
go-servemux-rest-api-cursorrules-prompt-file
This rule emphasizes security, scalability, and maintainability best practices in Go API development. Globs: `/*/**/*_api.go`