prompt-engineer
Use when designing prompts for LLMs, optimizing model performance, building evaluation frameworks, or implementing advanced prompting techniques like chain-of-thought, few-shot learning, or structured outputs.
Best use case
prompt-engineer is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Teams using prompt-engineer should expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in .claude/skills/prompt-engineer/SKILL.md inside your project
- Restart your AI agent; it will auto-discover the skill
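The manual steps can also be scripted. The following is a minimal sketch, assuming SKILL.md has already been downloaded to the current directory; the helper name `install_skill` is hypothetical, and only the target path comes from the instructions above.

```python
import shutil
from pathlib import Path


def install_skill(downloaded: Path, project_root: Path) -> Path:
    """Copy a downloaded SKILL.md into the project's skill directory."""
    target = project_root / ".claude" / "skills" / "prompt-engineer" / "SKILL.md"
    target.parent.mkdir(parents=True, exist_ok=True)  # create missing folders
    shutil.copyfile(downloaded, target)
    return target


if __name__ == "__main__":
    source = Path("SKILL.md")  # assumed download location
    if source.exists():
        print(install_skill(source, Path.cwd()))
    else:
        print("Download SKILL.md first, then re-run this script.")
```

Run it from the project root, then restart the agent so the skill is picked up.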
How prompt-engineer Compares
| Feature / Agent | prompt-engineer | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
It guides an AI agent through designing prompts for LLMs, optimizing model performance, building evaluation frameworks, and implementing advanced prompting techniques such as chain-of-thought, few-shot learning, and structured outputs.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# Prompt Engineer

Expert prompt engineer specializing in designing, optimizing, and evaluating prompts that maximize LLM performance across diverse use cases.

## Role Definition

You are an expert prompt engineer with deep knowledge of LLM capabilities, limitations, and prompting techniques. You design prompts that achieve reliable, high-quality outputs while considering token efficiency, latency, and cost. You build evaluation frameworks to measure prompt performance and iterate systematically toward optimal results.

## When to Use This Skill

- Designing prompts for new LLM applications
- Optimizing existing prompts for better accuracy or efficiency
- Implementing chain-of-thought or few-shot learning
- Creating system prompts with personas and guardrails
- Building structured output schemas (JSON mode, function calling)
- Developing prompt evaluation and testing frameworks
- Debugging inconsistent or poor-quality LLM outputs
- Migrating prompts between different models or providers

## Core Workflow

1. **Understand requirements** - Define task, success criteria, constraints, edge cases
2. **Design initial prompt** - Choose pattern (zero-shot, few-shot, CoT), write clear instructions
3. **Test and evaluate** - Run diverse test cases, measure quality metrics
4. **Iterate and optimize** - Refine based on failures, reduce tokens, improve reliability
5. **Document and deploy** - Version prompts, document behavior, monitor production

## Reference Guide

Load detailed guidance based on context:

| Topic | Reference | Load When |
|-------|-----------|-----------|
| Prompt Patterns | `references/prompt-patterns.md` | Zero-shot, few-shot, chain-of-thought, ReAct |
| Optimization | `references/prompt-optimization.md` | Iterative refinement, A/B testing, token reduction |
| Evaluation | `references/evaluation-frameworks.md` | Metrics, test suites, automated evaluation |
| Structured Outputs | `references/structured-outputs.md` | JSON mode, function calling, schema design |
| System Prompts | `references/system-prompts.md` | Persona design, guardrails, context management |

## Constraints

### MUST DO

- Test prompts with diverse, realistic inputs including edge cases
- Measure performance with quantitative metrics (accuracy, consistency)
- Version prompts and track changes systematically
- Document expected behavior and known limitations
- Use few-shot examples that match target distribution
- Validate structured outputs against schemas
- Consider token costs and latency in design
- Test across model versions before production deployment

### MUST NOT DO

- Deploy prompts without systematic evaluation on test cases
- Use few-shot examples that contradict instructions
- Ignore model-specific capabilities and limitations
- Skip edge case testing (empty inputs, unusual formats)
- Make multiple changes simultaneously when debugging
- Hardcode sensitive data in prompts or examples
- Assume prompts transfer perfectly between models
- Neglect monitoring for prompt degradation in production

## Output Templates

When delivering prompt work, provide:

1. Final prompt with clear sections (role, task, constraints, format)
2. Test cases and evaluation results
3. Usage instructions (temperature, max tokens, model version)
4. Performance metrics and comparison with baselines
5. Known limitations and edge cases

## Knowledge Reference

Prompt engineering techniques, chain-of-thought prompting, few-shot learning, zero-shot prompting, ReAct pattern, tree-of-thoughts, constitutional AI, prompt injection defense, system message design, JSON mode, function calling, structured generation, evaluation metrics, LLM capabilities (GPT-4, Claude, Gemini), token optimization, temperature tuning, output parsing
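The following is not part of SKILL.md. It is a minimal sketch of the "Test and evaluate" step combined with the "Validate structured outputs against schemas" constraint. Everything named here is an assumption for illustration: the prompt wording, the `SCHEMA` dictionary, the test cases, and `stub_model`, which stands in for a real LLM call so the sketch runs offline.

```python
import json
from typing import Callable, Optional

# Hypothetical prompt template; the wording is illustrative only.
PROMPT_V1 = (
    "Classify the sentiment of the review as positive or negative.\n"
    'Respond with JSON: {{"label": "<positive|negative>"}}\n'
    "Review: {review}"
)

# Required keys and their types; a stand-in for a real schema validator.
SCHEMA = {"label": str}
ALLOWED_LABELS = {"positive", "negative"}


def validate(raw: str) -> Optional[dict]:
    """Return the parsed output if it matches SCHEMA, else None."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    for key, expected_type in SCHEMA.items():
        if not isinstance(data.get(key), expected_type):
            return None
    return data if data["label"] in ALLOWED_LABELS else None


def evaluate(model: Callable[[str], str], cases: list) -> dict:
    """Run every (review, expected_label) case; report validity and accuracy."""
    valid = correct = 0
    for review, expected in cases:
        parsed = validate(model(PROMPT_V1.format(review=review)))
        if parsed is None:
            continue  # malformed output counts against both metrics
        valid += 1
        correct += parsed["label"] == expected
    total = len(cases)
    return {"valid_rate": valid / total, "accuracy": correct / total}


def stub_model(prompt: str) -> str:
    """Stand-in for a real LLM call (keyword matching, not a real model)."""
    review = prompt.rsplit("Review: ", 1)[1]
    if not review.strip():
        return "I cannot classify an empty review."  # malformed on purpose
    label = "positive" if "great" in review.lower() else "negative"
    return json.dumps({"label": label})


CASES = [
    ("Great battery life", "positive"),
    ("Broke after a day", "negative"),
    ("", "negative"),  # edge case: empty input
    ("Not great, not terrible", "negative"),  # edge case: negated keyword
]

if __name__ == "__main__":
    print(evaluate(stub_model, CASES))
```

With the stub, the empty input produces unparseable output and the negated keyword is misclassified, so the report shows a validity rate of 0.75 and an accuracy of 0.5. Swapping `stub_model` for a real client call and changing one element of the prompt at a time keeps runs comparable across prompt versions.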
Related Skills
writing-skills
Use when creating new skills, editing existing skills, or verifying skills work before deployment
writing-plans
Use when you have a spec or requirements for a multi-step task, before touching code
verification-before-completion
Use when about to claim work is complete, fixed, or passing, before committing or creating PRs - requires running verification commands and confirming output before making any success claims; evidence before assertions always
using-superpowers
Use when starting any conversation - establishes how to find and use skills, requiring Skill tool invocation before ANY response including clarifying questions
using-git-worktrees
Use when starting feature work that needs isolation from current workspace or before executing implementation plans - creates isolated git worktrees with smart directory selection and safety verification
ui-ux-pro-max
UI/UX design intelligence for web and mobile. Includes 50+ styles, 161 color palettes, 57 font pairings, 161 product types, 99 UX guidelines, and 25 chart types across 10 stacks (React, Next.js, Vue, Svelte, SwiftUI, React Native, Flutter, Tailwind, shadcn/ui, and HTML/CSS). Actions: plan, build, create, design, implement, review, fix, improve, optimize, enhance, refactor, and check UI/UX code. Projects: website, landing page, dashboard, admin panel, e-commerce, SaaS, portfolio, blog, and mobile app. Elements: button, modal, navbar, sidebar, card, table, form, and chart. Styles: glassmorphism, claymorphism, minimalism, brutalism, neumorphism, bento grid, dark mode, responsive, skeuomorphism, and flat design. Topics: color systems, accessibility, animation, layout, typography, font pairing, spacing, interaction states, shadow, and gradient. Integrations: shadcn/ui MCP for component search and examples.
test-driven-development
Use when implementing any feature or bugfix, before writing implementation code
systematic-debugging
Use when encountering any bug, test failure, or unexpected behavior, before proposing fixes
subagent-driven-development
Use when executing implementation plans with independent tasks in the current session
security-review
Security code review for vulnerabilities. Use when asked to "security review", "find vulnerabilities", "check for security issues", "audit security", "OWASP review", or review code for injection, XSS, authentication, authorization, cryptography issues. Provides systematic review with confidence-based reporting.
requesting-code-review
Use when completing tasks, implementing major features, or before merging to verify work meets requirements
receiving-code-review
Use when receiving code review feedback, before implementing suggestions, especially if feedback seems unclear or technically questionable - requires technical rigor and verification, not performative agreement or blind implementation