prompt-engineering

Prompt engineering techniques for LLMs — zero-shot, few-shot, chain-of-thought, ReAct, and structured prompting. Use when designing prompts for AI features, improving LLM output quality, building reliable AI pipelines, or getting consistent structured responses from language models.

26 stars

Best use case

prompt-engineering is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using prompt-engineering should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

curl -o ~/.claude/skills/prompt-engineering/SKILL.md --create-dirs "https://raw.githubusercontent.com/TerminalSkills/skills/main/skills/prompt-engineering/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/prompt-engineering/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How prompt-engineering Compares

Feature / Agent         | prompt-engineering | Standard Approach
------------------------|--------------------|------------------
Platform Support        | Not specified      | Limited / Varies
Context Awareness       | High               | Baseline
Installation Complexity | Unknown            | N/A

Frequently Asked Questions

What does this skill do?

It documents core prompt-engineering techniques for LLMs (zero-shot, few-shot, chain-of-thought, Tree-of-Thought, ReAct, self-consistency, and meta-prompting), plus structured prompting, reusable templates, systematic evaluation, and model-specific tips for Claude, GPT-4, and Gemini.

Where can I find the source code?

The skill is published in the TerminalSkills/skills repository on GitHub, at skills/prompt-engineering/SKILL.md (the same path the curl command above downloads from).

SKILL.md Source

# Prompt Engineering

## Overview

Prompt engineering is the practice of crafting inputs to language models to reliably produce desired outputs. Good prompts reduce hallucinations, increase consistency, and unlock model capabilities. This skill covers the key techniques: zero-shot, few-shot, chain-of-thought (CoT), Tree-of-Thought (ToT), ReAct, self-consistency, and meta-prompting.

## Core Techniques

### Zero-Shot Prompting

No examples — rely on the model's training. Works well for clear, simple tasks.

```python
prompt = """Classify the sentiment of the following review as POSITIVE, NEGATIVE, or NEUTRAL.

Review: "The delivery was fast but the packaging was damaged."

Sentiment:"""
```

### Few-Shot Prompting

Provide 2–5 examples to guide the model's output format and style.

```python
prompt = """Classify sentiment. Examples:

Review: "Amazing product, works perfectly!" → POSITIVE
Review: "Arrived broken, waste of money." → NEGATIVE
Review: "It's okay, nothing special." → NEUTRAL

Review: "The battery life is shorter than advertised."
Sentiment:"""
```

**Tips:**
- Use diverse, representative examples
- Keep examples consistent in format
- 3–5 examples are usually optimal; more can distract the model
- Put examples before the actual input (a small builder for assembling examples consistently is sketched below)
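
Keeping the format consistent across examples is easiest with a small builder. A minimal sketch; the `build_few_shot_prompt` helper and its example structure are illustrative, not part of any library:

```python
def build_few_shot_prompt(examples: list[tuple[str, str]], query: str) -> str:
    """Assemble a few-shot prompt from (input, label) pairs plus the real query."""
    lines = ["Classify sentiment. Examples:", ""]
    for text, label in examples:
        lines.append(f'Review: "{text}" → {label}')
    lines.extend(["", f'Review: "{query}"', "Sentiment:"])
    return "\n".join(lines)

examples = [
    ("Amazing product, works perfectly!", "POSITIVE"),
    ("Arrived broken, waste of money.", "NEGATIVE"),
    ("It's okay, nothing special.", "NEUTRAL"),
]
print(build_few_shot_prompt(examples, "The battery life is shorter than advertised."))
```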

### Chain-of-Thought (CoT)

Ask the model to reason step-by-step before answering. Dramatically improves accuracy on math, logic, and multi-step tasks.

```python
# Zero-shot CoT — just add "Let's think step by step"
prompt = """A store sells apples for $0.50 each and oranges for $0.75 each.
Alice buys 4 apples and 3 oranges. How much does she spend?

Let's think step by step."""

# Few-shot CoT — include reasoning in examples
prompt = """Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls.
Each can has 3 balls. How many tennis balls does he have now?
A: Roger starts with 5 balls. 2 cans × 3 balls = 6 balls. 5 + 6 = 11. The answer is 11.

Q: Alice has 10 apples. She gives 3 to Bob and 2 to Charlie. How many does she have?
A:"""
```

### Tree-of-Thought (ToT)

Generate multiple reasoning paths, evaluate them, and pick the best. Useful for creative or open-ended problems.

```python
prompt = """Think of 3 different approaches to solve this problem, evaluate each briefly,
then pick the best one and execute it.

Problem: Design a caching strategy for an API that has both frequently-accessed
stable data and rapidly-changing user-specific data.

Approach 1:
Approach 2:
Approach 3:

Best approach and implementation:"""
```

### ReAct (Reason + Act)

Interleave reasoning (Thought) with actions (Action/Observation) in a loop. Foundation of tool-using agents.

```python
system = """You solve tasks by alternating between Thought, Action, and Observation.
Available actions: search(query), calculate(expression), done(answer)

Format:
Thought: [your reasoning]
Action: [action to take]
Observation: [result of action]
... (repeat as needed)
Thought: I now have the answer.
Action: done([final answer])"""

user = "What is the square root of the population of Tokyo?"
```
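
The system prompt only defines the protocol; an agent still needs a driver loop that parses each Action, executes it, and feeds the Observation back. A minimal sketch, assuming a hypothetical `call_llm` helper and caller-supplied tool functions:

```python
import re

def react_loop(system: str, user: str, tools: dict, max_steps: int = 10) -> str:
    """Drive a ReAct loop: parse each Action, run the tool, feed back an Observation."""
    transcript = user
    for _ in range(max_steps):
        # call_llm is a hypothetical helper for your provider's API; in production,
        # set a stop sequence on "Observation:" so the model can't invent results.
        reply = call_llm(system=system, prompt=transcript)
        transcript += "\n" + reply
        match = re.search(r"Action:\s*(\w+)\((.*)\)", reply)
        if not match:
            break  # model stopped following the format
        name, arg = match.group(1), match.group(2).strip()
        if name == "done":
            return arg  # final answer
        observation = tools[name](arg)  # e.g. tools = {"search": ..., "calculate": ...}
        transcript += f"\nObservation: {observation}"
    return transcript  # ran out of steps without done()
```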

### Self-Consistency

Generate multiple independent answers, then pick the most common one. Improves reliability on reasoning tasks.

```python
import asyncio

async def self_consistent_answer(question, n=5):
    """Generate N answers and pick by majority vote."""
    prompts = [f"{question}\n\nThink step by step and give your final answer." for _ in range(n)]
    answers = await asyncio.gather(*[call_llm(p) for p in prompts])
    # Extract final answers and find most common
    final_answers = [extract_answer(a) for a in answers]
    return max(set(final_answers), key=final_answers.count)
```
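
`call_llm` and `extract_answer` above are left to the caller. A runnable sketch with illustrative stubs standing in for a real model call:

```python
import asyncio
import random

async def call_llm(prompt: str) -> str:
    # Stand-in for a real model call; swap in your provider's SDK here.
    return f"Reasoning...\nThe answer is {random.choice([408, 408, 398])}."

def extract_answer(completion: str) -> str:
    # Naive extraction: grab whatever follows "The answer is"; adapt to your format.
    return completion.rsplit("The answer is", 1)[-1].strip(" .\n")

print(asyncio.run(self_consistent_answer("What is 17 * 24?", n=5)))  # usually 408
```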

### Meta-Prompting

Use LLMs to generate or improve prompts for other LLMs.

```python
meta_prompt = """You are an expert prompt engineer. Create an optimized prompt for the following task.

Task: {task_description}
Target model: {model_name}
Desired output format: {output_format}

Generate a prompt that:
1. Clearly specifies the task
2. Includes necessary context
3. Defines output format precisely
4. Handles edge cases

Optimized prompt:"""
```
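
In practice this is a two-pass call: one request generates the prompt, a second runs it on real input. A minimal sketch, again assuming a hypothetical `call_llm` helper:

```python
def optimize_and_run(task_description: str, model_name: str,
                     output_format: str, task_input: str) -> str:
    """Pass 1: generate an optimized prompt. Pass 2: run it on real input."""
    generated = call_llm(meta_prompt.format(  # call_llm: hypothetical helper
        task_description=task_description,
        model_name=model_name,
        output_format=output_format,
    ))
    # Treat the generated prompt as untrusted: review it, or score it with the
    # evaluation harness below, before relying on it in a pipeline.
    return call_llm(f"{generated}\n\nInput: {task_input}")
```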

## Structured Prompting

### XML Tags (Claude-Optimized)

Claude responds especially well to XML-tagged content sections.

```python
prompt = """<task>
Extract all product names and prices from the following receipt text.
</task>

<format>
Return a JSON array: [{"name": "...", "price": 0.00}]
</format>

<receipt>
{receipt_text}
</receipt>

JSON output:"""
```

### Role Assignment

```python
system = """You are a senior Python engineer specializing in performance optimization.
You write clean, well-documented code with O(n) complexity analysis.
When reviewing code, always:
1. Identify bottlenecks
2. Suggest specific optimizations
3. Provide rewritten examples"""
```

### Delimiter-Based Prompts

````python
prompt = """Summarize the text between triple backticks in 2-3 sentences.

```
{text_to_summarize}
```

Summary:"""
````

## Prompt Templates

### Reusable Template Class

```python
class PromptTemplate:
    def __init__(self, template: str, required_vars: list[str]):
        self.template = template
        self.required_vars = required_vars

    def format(self, **kwargs) -> str:
        missing = [v for v in self.required_vars if v not in kwargs]
        if missing:
            raise ValueError(f"Missing variables: {missing}")
        return self.template.format(**kwargs)

# Example usage
extraction_template = PromptTemplate(
    template="""Extract {field} from the following {doc_type}.

<document>
{document}
</document>

Return only the extracted {field}, nothing else.""",
    required_vars=["field", "doc_type", "document"]
)

prompt = extraction_template.format(
    field="email addresses",
    doc_type="email thread",
    document=email_text  # email_text: the raw thread text, loaded elsewhere
)
```

## Evaluation: Testing Prompts Systematically

```python
from dataclasses import dataclass

@dataclass
class TestCase:
    input: str
    expected: str
    description: str

def evaluate_prompt(prompt_template: str, test_cases: list[TestCase], llm_fn):
    results = []
    for tc in test_cases:
        prompt = prompt_template.format(input=tc.input)
        actual = llm_fn(prompt)
        passed = tc.expected.lower() in actual.lower()
        results.append({
            "description": tc.description,
            "passed": passed,
            "expected": tc.expected,
            "actual": actual
        })

    accuracy = sum(r["passed"] for r in results) / len(results)
    print(f"Accuracy: {accuracy:.0%} ({sum(r['passed'] for r in results)}/{len(results)})")

    failed = [r for r in results if not r["passed"]]
    if failed:
        print("\nFailed cases:")
        for f in failed:
            print(f"  - {f['description']}: expected '{f['expected']}', got '{f['actual'][:100]}'")

    return results
```
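
A usage sketch against the sentiment task from earlier; the test cases are illustrative and `call_llm` is a hypothetical synchronous model-call helper:

```python
sentiment_template = """Classify the sentiment of the following review as POSITIVE, NEGATIVE, or NEUTRAL.

Review: {input}

Sentiment:"""

test_cases = [
    TestCase(input="Amazing product, works perfectly!", expected="POSITIVE",
             description="clear positive"),
    TestCase(input="Arrived broken, waste of money.", expected="NEGATIVE",
             description="clear negative"),
    TestCase(input="", expected="NEUTRAL",
             description="empty review edge case"),
]

results = evaluate_prompt(sentiment_template, test_cases, llm_fn=call_llm)
```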

## Model-Specific Differences

### Claude (Anthropic)
- Responds very well to XML tags (`<task>`, `<context>`, `<format>`)
- Prefers explicit, detailed instructions over implicit expectations
- Honors "do not" instructions reliably
- Works well with `<thinking>` tags for CoT (extended thinking)
- System prompt sets persona/constraints; user prompt is the task (SDK sketch below)
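
A minimal sketch using the official `anthropic` Python SDK; the model ID is illustrative, so check the current model list:

```python
from anthropic import Anthropic

client = Anthropic()  # reads ANTHROPIC_API_KEY from the environment
message = client.messages.create(
    model="claude-sonnet-4-20250514",  # illustrative model ID
    max_tokens=1024,
    system="You are a precise data-extraction assistant.",  # persona/constraints
    messages=[{
        "role": "user",
        "content": "<task>Extract all dates.</task>\n<document>Shipped May 3, arrived May 7.</document>",
    }],
)
print(message.content[0].text)
```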

### GPT-4 (OpenAI)
- Works well with markdown headers and bullet lists in prompts
- Strong at following JSON schema when given explicit examples
- `response_format: { type: "json_object" }` enforces JSON output (sketch below)
- Temperature 0 for deterministic tasks; 0.7 for creative work
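
A minimal JSON-mode sketch using the `openai` Python SDK (v1-style client); the model ID is illustrative. Note that JSON mode expects the word "JSON" to appear somewhere in the messages:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
resp = client.chat.completions.create(
    model="gpt-4o",  # illustrative model ID
    messages=[
        {"role": "system", "content": "You classify reviews. Reply as a JSON object."},
        {"role": "user", "content": 'Review: "Arrived broken." Fields: sentiment.'},
    ],
    response_format={"type": "json_object"},  # enforces syntactically valid JSON
    temperature=0,  # deterministic classification
)
print(resp.choices[0].message.content)
```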

### Gemini (Google)
- Performs best with clear, concise instructions
- Multimodal: can process images/PDFs natively in prompts
- Use `generationConfig.responseMimeType = "application/json"` for JSON output (sketch below)
- Strong instruction-following with numbered steps
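
A minimal sketch using the `google-generativeai` Python SDK; the model ID is illustrative:

```python
import google.generativeai as genai

genai.configure()  # reads GOOGLE_API_KEY from the environment
model = genai.GenerativeModel("gemini-1.5-flash")  # illustrative model ID
resp = model.generate_content(
    "List three prompt-engineering techniques as a JSON array of strings.",
    generation_config=genai.GenerationConfig(
        response_mime_type="application/json",  # forces JSON output
    ),
)
print(resp.text)
```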

## Guidelines

- Start simple (zero-shot) and add complexity only if needed
- Be explicit about output format — show a JSON example if you want JSON
- Use system prompt for persona/constraints, user prompt for the actual task
- Test prompts on adversarial inputs (edge cases, contradictions, empty inputs)
- Version-control your prompts like code — track changes and metrics
- Shorter prompts are usually faster and cheaper; add length only when it helps accuracy
- For extraction tasks, always specify what to return when the field is not found

Related Skills

prompts-chat

from TerminalSkills/skills

Browse, search, and self-host a community prompt library with 1000+ curated prompts. Use when: finding proven prompts for specific tasks, building a team prompt library, learning prompt patterns from community-tested examples.

promptfoo

from TerminalSkills/skills

Test and evaluate LLM prompts systematically with Promptfoo — open-source eval framework. Use when someone asks to "test my prompts", "evaluate LLM output", "Promptfoo", "prompt regression testing", "compare LLM models", "LLM evaluation framework", or "benchmark prompts against test cases". Covers test cases, assertions, model comparison, red-teaming, and CI integration.

prompt-tester

from TerminalSkills/skills

Design, test, and iterate on AI prompts systematically using structured evaluation criteria. Use when building AI features, optimizing agent instructions, comparing prompt variants, or evaluating output quality across edge cases. Trigger words: prompt engineering, prompt testing, eval, LLM evaluation, prompt comparison, A/B test prompts, prompt optimization, system prompt, instruction tuning.

context-engineering

from TerminalSkills/skills

Optimizes agent context setup for maximum output quality. Use when starting a new session, when agent output quality degrades, when switching between tasks, or when configuring rules files and context for AI-assisted development. Covers the context hierarchy, packing strategies, and confusion management.
