Instructor — Structured LLM Output with Validation

You are an expert in Instructor, the library for getting structured, validated output from LLMs. You help developers extract typed data from unstructured text using Pydantic models (Python) or Zod schemas (TypeScript), with automatic retries on validation failures, streaming partial objects, and support for OpenAI, Anthropic, Google, and local models — turning LLMs into reliable data extraction engines.

by ComeOnOliver

Best use case

Instructor — Structured LLM Output with Validation is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using Instructor — Structured LLM Output with Validation should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/instructor/SKILL.md --create-dirs "https://raw.githubusercontent.com/ComeOnOliver/skillshub/main/skills/TerminalSkills/skills/instructor/SKILL.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/instructor/SKILL.md inside your project
Restart your AI agent — it will auto-discover the skill

How Instructor — Structured LLM Output with Validation Compares

Feature / Agent	Instructor — Structured LLM Output with Validation	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

What does this skill do?

It guides an AI agent in using Instructor to extract typed, validated data from unstructured text with Pydantic models (Python) or Zod schemas (TypeScript).

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# Instructor — Structured LLM Output with Validation

## Core Capabilities

### Python (Pydantic)

```python
# extraction.py — Type-safe LLM extraction
import instructor
from openai import OpenAI
from pydantic import BaseModel, Field
from typing import Literal

client = instructor.from_openai(OpenAI())

class ContactInfo(BaseModel):
    name: str = Field(description="Full name of the person")
    email: str | None = Field(default=None, description="Email address if mentioned")
    phone: str | None = Field(default=None, description="Phone number if mentioned")
    company: str | None = Field(default=None)
    role: str | None = Field(default=None)

class ExtractedContacts(BaseModel):
    contacts: list[ContactInfo]
    confidence: float = Field(ge=0, le=1, description="Overall extraction confidence")

# Extract structured data; the result is validated against the schema,
# and an exception is raised if validation still fails after the retries
result = client.chat.completions.create(
    model="gpt-4o-mini",
    response_model=ExtractedContacts,
    messages=[{
        "role": "user",
        "content": """Extract contacts from this email:

Hi, I'm reaching out on behalf of Sarah Chen (sarah@techcorp.io),
VP of Engineering at TechCorp. She'd like to schedule a call.
You can also reach her at (415) 555-0123.

CC: Mike Johnson, mike.j@techcorp.io, Head of DevOps""",
    }],
    max_retries=3,                        # Auto-retry on validation failure
)

# result.contacts[0].name → "Sarah Chen"
# result.contacts[0].email → "sarah@techcorp.io"
# result.contacts[0].role → "VP of Engineering"
# Fully typed, validated by Pydantic

# Sentiment analysis with enum
class SentimentAnalysis(BaseModel):
    sentiment: Literal["positive", "negative", "neutral", "mixed"]
    emotions: list[Literal["joy", "anger", "sadness", "fear", "surprise", "disgust"]]
    key_phrases: list[str]
    summary: str

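# Placeholder input; replace with the review you want to analyze
review_text = "The battery life is great, but the screen scratches easily."

analysis = client.chat.completions.create(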
    model="gpt-4o-mini",
    response_model=SentimentAnalysis,
    messages=[{"role": "user", "content": f"Analyze sentiment: {review_text}"}],
)

# Streaming partial objects
# Placeholder input; replace with the text you want to extract from
email_text = "Sarah Chen (sarah@techcorp.io) is VP of Engineering at TechCorp."

for partial in client.chat.completions.create_partial(
    model="gpt-4o",
    response_model=ExtractedContacts,
    messages=[{"role": "user", "content": email_text}],
):
    # Fields that have not streamed yet may be None, so guard before len()
    contacts = partial.contacts or []
    print(f"Found {len(contacts)} contacts so far...")
```

### TypeScript (Zod)

```typescript
import Instructor from "@instructor-ai/instructor";
import OpenAI from "openai";
import { z } from "zod";

const client = Instructor({ client: new OpenAI(), mode: "TOOLS" });

const ContactSchema = z.object({
  contacts: z.array(z.object({
    name: z.string(),
    email: z.string().email().nullable(),
    role: z.string().nullable(),
  })),
  confidence: z.number().min(0).max(1),
});

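// Placeholder input; replace with the text you want to extract from
const emailText = "Sarah Chen (sarah@techcorp.io), VP of Engineering at TechCorp.";

const result = await client.chat.completions.create({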
  model: "gpt-4o-mini",
  response_model: { schema: ContactSchema, name: "ContactExtraction" },
  messages: [{ role: "user", content: emailText }],
  max_retries: 3,
});
// result is fully typed as z.infer<typeof ContactSchema>
```

### Multi-Provider

```python
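# Works with several providers; ExtractedContacts is the model
# defined in the Python (Pydantic) example above
from anthropic import Anthropic
import instructor

# Placeholder input; replace with the text you want to extract from
text = "Sarah Chen (sarah@techcorp.io) is VP of Engineering at TechCorp."

# Anthropic
client = instructor.from_anthropic(Anthropic())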
result = client.messages.create(
    model="claude-sonnet-4-20250514",
    response_model=ExtractedContacts,
    messages=[{"role": "user", "content": text}],
    max_tokens=1024,
)

# Local models (Ollama)
from openai import OpenAI
client = instructor.from_openai(
    OpenAI(base_url="http://localhost:11434/v1", api_key="ollama"),
    mode=instructor.Mode.JSON,
)
```

## Installation

```bash
pip install instructor                    # Python
npm install @instructor-ai/instructor zod  # TypeScript
```

## Best Practices

1. **Pydantic/Zod for schema**: Define the exact output shape; Instructor validates the LLM output against it and returns a typed object
2. **Field descriptions**: Add `description` to fields; the descriptions are included in the schema and help the LLM understand what to extract
3. **max_retries**: Set to 2-3; when output fails validation, Instructor retries with the validation error as feedback
4. **Literals for enums**: Use `Literal["a", "b"]` instead of `str` for categorical fields; this constrains the values the LLM may return
5. **Nested models**: Use nested Pydantic models for complex structures so hierarchical data is extracted into matching types
6. **Streaming**: Use `create_partial` for progressive rendering; show partial results as they arrive
7. **Smaller models for extraction**: Structured extraction often doesn't need the most capable model; a smaller model such as `gpt-4o-mini` is considerably cheaper and faster, so start there and move up only if accuracy suffers
8. **Validation as feedback**: When validation fails, Instructor sends the error back to the LLM for self-correction, so write validator messages that say how to fix the problem
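
Items 3 and 8 describe a loop in which a validation error is fed back to the model before the next attempt. The following is a conceptual sketch of that loop using only the standard library. It is not Instructor's actual implementation, and `fake_llm`, `validate_contact`, `extract_with_retries`, and the `max_attempts` parameter are hypothetical names invented for illustration.

```python
import json
from typing import Callable


def validate_contact(raw: str) -> dict:
    """Parse JSON and check a minimal schema; raise ValueError with a readable message."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"output is not valid JSON: {exc.msg}") from exc
    if not isinstance(data, dict):
        raise ValueError("output must be a JSON object")
    if not isinstance(data.get("name"), str):
        raise ValueError("field 'name' must be a string")
    return data


def extract_with_retries(
    llm: Callable[[list], str], prompt: str, max_attempts: int = 3
) -> dict:
    """Call `llm` until its output validates, appending each error as feedback."""
    messages = [{"role": "user", "content": prompt}]
    last_error = None
    for _ in range(max_attempts):
        raw = llm(messages)
        try:
            return validate_contact(raw)
        except ValueError as exc:
            last_error = exc
            # Show the model its own output and the reason it was rejected.
            messages.append({"role": "assistant", "content": raw})
            messages.append(
                {"role": "user", "content": f"Validation failed: {exc}. Please correct it."}
            )
    raise RuntimeError(f"no valid output after {max_attempts} attempts: {last_error}")


def fake_llm(messages: list) -> str:
    """Stand-in for a real model: wrong on the first call, right once it sees feedback."""
    if len(messages) == 1:
        return '{"name": 42}'
    return '{"name": "Sarah Chen"}'


result = extract_with_retries(fake_llm, "Extract the contact from the email.")
print(result)
```

The design point is that the error message is the only signal the model gets about what went wrong, which is why specific validator messages tend to matter more than a high retry count.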
