Documents

Read, write, convert, and analyze documents — routes to PDF, DOCX, XLSX, PPTX sub-skills for creation, editing, extraction, and format conversion. USE WHEN document, process file, create document, convert format, extract text, PDF, DOCX, XLSX, PPTX, Word, Excel, spreadsheet, PowerPoint, presentation, slides, consulting report, large PDF, merge PDF, fill form, tracked changes, redlining.

11,146 stars

Best use case

Documents is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Read, write, convert, and analyze documents — routes to PDF, DOCX, XLSX, PPTX sub-skills for creation, editing, extraction, and format conversion. USE WHEN document, process file, create document, convert format, extract text, PDF, DOCX, XLSX, PPTX, Word, Excel, spreadsheet, PowerPoint, presentation, slides, consulting report, large PDF, merge PDF, fill form, tracked changes, redlining.

Teams using Documents should expect a more consistent output, faster repeated execution, less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$curl -o ~/.claude/skills/Documents/SKILL.md --create-dirs "https://raw.githubusercontent.com/danielmiessler/Personal_AI_Infrastructure/main/Packs/Utilities/src/Documents/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/Documents/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How Documents Compares

Feature / AgentDocumentsStandard Approach
Platform SupportNot specifiedLimited / Varies
Context Awareness High Baseline
Installation ComplexityUnknownN/A

Frequently Asked Questions

What does this skill do?

Read, write, convert, and analyze documents — routes to PDF, DOCX, XLSX, PPTX sub-skills for creation, editing, extraction, and format conversion. USE WHEN document, process file, create document, convert format, extract text, PDF, DOCX, XLSX, PPTX, Word, Excel, spreadsheet, PowerPoint, presentation, slides, consulting report, large PDF, merge PDF, fill form, tracked changes, redlining.

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

Related Guides

SKILL.md Source

## Customization

**Before executing, check for user customizations at:**
`~/.claude/PAI/USER/SKILLCUSTOMIZATIONS/Documents/`

If this directory exists, load and apply any PREFERENCES.md, configurations, or resources found there. These override default behavior. If the directory does not exist, proceed with skill defaults.


## 🚨 MANDATORY: Voice Notification (REQUIRED BEFORE ANY ACTION)

**You MUST send this notification BEFORE doing anything else when this skill is invoked.**

1. **Send voice notification**:
   ```bash
   curl -s -X POST http://localhost:8888/notify \
     -H "Content-Type: application/json" \
     -d '{"message": "Running the WORKFLOWNAME workflow in the Documents skill to ACTION"}' \
     > /dev/null 2>&1 &
   ```

2. **Output text notification**:
   ```
   Running the **WorkflowName** workflow in the **Documents** skill to ACTION...
   ```

**This is not optional. Execute this curl command immediately upon skill invocation.**

# Documents Skill

## 🎯 Load Full PAI Context

**Before starting any task with this skill, load complete PAI context:**

`read ~/.claude/PAI/SKILL.md`


## When to Activate This Skill

### Word Documents (DOCX)
- User wants to create, edit, or analyze Word documents
- User mentions "tracked changes", "redlining", "document review"
- User needs to convert documents to other formats
- User wants to work with document structure, comments, or formatting

### PDF Files
- User wants to create, merge, split, or manipulate PDFs
- User mentions "extract text from PDF", "PDF tables", "fill PDF form"
- User needs to convert PDFs to/from other formats
- User wants to add watermarks, passwords, or extract images

### PowerPoint Presentations (PPTX)
- User wants to create or edit presentations
- User mentions "slides", "presentation template", "speaker notes"
- User needs to convert presentations to other formats
- User wants to work with slide layouts or design elements

### Excel Spreadsheets (XLSX)
- User wants to create or edit spreadsheets
- User mentions "formulas", "financial model", "data analysis"
- User needs to work with Excel tables, charts, or pivot tables
- User wants to convert spreadsheets to/from other formats

## Workflow Routing

| Request Pattern | Route To |
|---|---|
| Consulting report, McKinsey report, assessment report, professional PDF | `Workflows/ConsultingReport.md` |
| Large PDF, process big PDF, Gemini PDF | `Workflows/ProcessLargePdfGemini3.md` |
| Word document, DOCX, create docx, edit docx, tracked changes, redlining | `Docx/SKILL.md` |
| PDF, create PDF, merge PDF, split PDF, extract text from PDF, fill form | `Pdf/SKILL.md` |
| Presentation, PPTX, slides, PowerPoint, speaker notes | `Pptx/SKILL.md` |
| Spreadsheet, XLSX, Excel, formulas, financial model, data analysis | `Xlsx/SKILL.md` |

## Document Type Details

This skill organizes document processing across 4 document types plus specialized workflows:

### Consulting Reports (HTML + Playwright PDF)

**Reference Documentation:**
- `Workflows/ConsultingReport.md` - Complete consulting report generation workflow

**Routing Logic:**
- "Create consulting report", "generate report PDF" → ConsultingReport workflow
- "Build assessment report", "strategic assessment" → ConsultingReport workflow
- "McKinsey-style report", "professional report PDF" → ConsultingReport workflow

**Pipeline:** Report Artifacts → Structured HTML → Playwright PDF

**Key Capabilities:**
- Parse report directories with mixed content (markdown, TypeScript data, images)
- Professional CSS typography (Georgia serif body, Inter sans headings)
- Color-coded callout boxes (red/amber/green) and severity badges
- Auto-generated linked Table of Contents
- Cover page with classification marking
- Headers/footers with CONFIDENTIAL and page numbers
- Image compression pipeline (PNG → JPEG, max 1200px)
- A4 format with Playwright for pixel-perfect PDF output

**Reference Implementation:** Quorum Cyber assessment at `/path/to/reference/generate-pdf.mjs`


### Word Documents (DOCX)

**Reference Documentation:**
- `docx/SKILL.md` - Complete DOCX processing guide
- `docx/docx-js.md` - Creating new documents with JavaScript
- `docx/ooxml.md` - Editing existing documents with OOXML

**Routing Logic:**
- "Create Word document", "new docx" → Create workflow (docx-js)
- "Edit Word document", "tracked changes", "redlining" → Edit workflow (OOXML)
- "Read Word document", "extract text from docx" → Read workflow (pandoc)
- "Document review", "track changes" → Redlining workflow

**Supporting Resources:**
- Scripts: `~/.claude/skills/Utilities/Documents/Docx/Scripts/`
- OOXML tools: `~/.claude/skills/Utilities/Documents/Docx/ooxml/`
- License: `~/.claude/skills/Utilities/Documents/Docx/LICENSE.txt`

**Key Capabilities:**
- Create professional documents with docx-js
- Edit with tracked changes (redlining workflow)
- Extract text/comments with pandoc
- Convert to images for visual inspection
- Work with raw OOXML for advanced features

### PDF Processing

**Reference Documentation:**
- `pdf/SKILL.md` - Complete PDF processing guide
- `pdf/forms.md` - Filling PDF forms
- `pdf/reference.md` - Advanced features and troubleshooting

**Routing Logic:**
- "Create PDF" → Creation workflow (reportlab)
- "Merge PDFs", "split PDF" → Manipulation workflow (pypdf)
- "Extract text from PDF" → Extraction workflow (pdfplumber)
- "Fill PDF form" → Forms workflow (pdf-lib or pypdf)
- "Extract tables from PDF" → Table extraction (pdfplumber + pandas)

**Supporting Resources:**
- Scripts: `~/.claude/skills/Utilities/Documents/Pdf/Scripts/`
- License: `~/.claude/skills/Utilities/Documents/Pdf/LICENSE.txt`

**Key Capabilities:**
- Create PDFs with reportlab
- Extract text/tables with pdfplumber
- Merge/split with pypdf or qpdf
- Fill forms programmatically
- Add watermarks and password protection
- Extract images from PDFs

### PowerPoint Presentations (PPTX)

**Reference Documentation:**
- `pptx/SKILL.md` - Complete PPTX processing guide
- `pptx/html2pptx.md` - Creating presentations from HTML
- `pptx/ooxml.md` - Editing existing presentations

**Routing Logic:**
- "Create presentation", "new slides" → Creation workflow (html2pptx)
- "Edit presentation", "modify slides" → Edit workflow (OOXML)
- "Use presentation template" → Template workflow
- "Extract slide text" → Read workflow (markitdown)
- "Create thumbnail grid" → Thumbnail workflow

**Supporting Resources:**
- Scripts: `~/.claude/skills/Utilities/Documents/Pptx/Scripts/`
- OOXML tools: `~/.claude/skills/Utilities/Documents/Pptx/ooxml/`
- License: `~/.claude/skills/Utilities/Documents/Pptx/LICENSE.txt`

**Key Capabilities:**
- Create presentations with html2pptx (HTML → PPTX)
- Professional design with color palettes and layouts
- Edit with OOXML for advanced features
- Work with templates (rearrange, inventory, replace)
- Generate thumbnail grids for visual analysis
- Convert to images for inspection

### Excel Spreadsheets (XLSX)

**Reference Documentation:**
- `xlsx/SKILL.md` - Complete XLSX processing guide
- `xlsx/recalc.py` - Formula recalculation script

**Routing Logic:**
- "Create spreadsheet", "new Excel file" → Creation workflow (openpyxl)
- "Edit spreadsheet", "modify Excel" → Edit workflow (openpyxl)
- "Analyze data", "read Excel" → Analysis workflow (pandas)
- "Financial model", "formulas" → Financial modeling workflow
- "Recalculate formulas" → Recalculation workflow (recalc.py)

**Supporting Resources:**
- Recalc script: `~/.claude/skills/Utilities/Documents/Xlsx/recalc.py`
- License: `~/.claude/skills/Utilities/Documents/Xlsx/LICENSE.txt`

**Key Capabilities:**
- Create spreadsheets with formulas (openpyxl)
- Data analysis with pandas
- Financial modeling with color coding standards
- Formula recalculation with LibreOffice
- Error detection and validation
- Preserve formatting and formulas when editing

## 📋 Document Processing Principles

### DOCX Best Practices
1. **Tracked Changes** - Use redlining workflow for professional document review
2. **Minimal Edits** - Only mark text that actually changes, preserve original RSIDs
3. **Batch Changes** - Group related edits (3-10 changes) for efficient processing
4. **Verification** - Always convert to markdown to verify changes applied correctly

### PDF Best Practices
1. **Library Selection** - pypdf for basic ops, pdfplumber for text/tables, reportlab for creation
2. **OCR for Scanned** - Use pytesseract + pdf2image for scanned documents
3. **Form Filling** - Follow forms.md for programmatic form completion
4. **Command Line** - Use qpdf/pdftotext for simple operations

### PPTX Best Practices
1. **Design First** - Analyze content and choose appropriate colors/layouts before coding
2. **Web-Safe Fonts** - Only use web-safe fonts (Arial, Helvetica, Times, etc.)
3. **Visual Verification** - Always generate thumbnails to inspect layout issues
4. **Template Analysis** - Create inventory before using templates to understand structure

### XLSX Best Practices
1. **Use Formulas** - ALWAYS use Excel formulas, NEVER hardcode calculated values
2. **Zero Errors** - Deliver with zero formula errors (#REF!, #DIV/0!, etc.)
3. **Recalculate** - Run recalc.py after creating/editing to update formula values
4. **Financial Standards** - Follow color coding (blue inputs, black formulas, green links)

## Examples

**Example 1: Create proposal with tracked changes**
```
User: "Create a consulting proposal doc with redlining"
→ Routes to DOCX workflows
→ Creates document with docx-js
→ Enables tracked changes for review workflow
→ Outputs professional .docx with revision marks
```

**Example 2: Fill a PDF form programmatically**
```
User: "Fill out this NDA PDF with my info"
→ Routes to PDF workflows
→ Reads form fields from PDF
→ Fills fields programmatically with pdf-lib
→ Outputs completed, flattened PDF
```

**Example 3: Build financial model spreadsheet**
```
User: "Create a revenue projection spreadsheet"
→ Routes to XLSX workflows
→ Creates workbook with openpyxl
→ Adds formulas (never hardcoded values)
→ Runs recalc.py to update calculations
```

**Example 4: Generate professional consulting report PDF**
```
User: "Create a consulting report from the assessment data"
→ Routes to ConsultingReport workflow
→ Parses report directory for data files, markdown, diagrams
→ Compresses images (PNG→JPEG, max 1200px)
→ Generates styled HTML with professional typography
→ Converts to PDF via Playwright with headers/footers
→ Outputs McKinsey-quality A4 PDF with TOC, diagrams, color boxes
```

## 🔗 Integration with Other Skills

### Feeds Into:
- **writing** skill - Creating documents for blog posts and newsletters
- **business** skill - Creating consulting proposals and financial models
- **research** skill - Extracting data from research documents

### Uses:
- **media** skill - Creating images for document illustrations
- **development** skill - Building document processing automation
- **system** skill - Command-line tools and scripting

## 🎯 Key Principles

### Document Creation
1. **Quality First** - Professional formatting and structure from the start
2. **Template Reuse** - Leverage existing templates when available
3. **Validation** - Always verify output (visual inspection, error checking)
4. **Automation** - Use scripts for repetitive tasks

### Document Editing
1. **Preserve Intent** - Maintain original formatting and structure
2. **Track Changes** - Use proper workflows for document review
3. **Batch Processing** - Group related operations for efficiency
4. **Error Prevention** - Validate before finalizing

### Document Analysis
1. **Right Tool** - Choose appropriate library/tool for the task
2. **Data Integrity** - Preserve original data when extracting/converting
3. **Format Awareness** - Understand document structure (OOXML, PDF structure, etc.)
4. **Performance** - Use efficient methods for large documents

## 📚 Full Reference Documentation

**Word Documents (DOCX):**
- Main Guide: `~/.claude/skills/Utilities/Documents/Docx/SKILL.md`
- Creation Reference: `~/.claude/skills/Utilities/Documents/Docx/docx-js.md`
- Editing Reference: `~/.claude/skills/Utilities/Documents/Docx/ooxml.md`

**PDF Processing:**
- Main Guide: `~/.claude/skills/Utilities/Documents/Pdf/SKILL.md`
- Forms Guide: `~/.claude/skills/Utilities/Documents/Pdf/forms.md`
- Advanced Reference: `~/.claude/skills/Utilities/Documents/Pdf/reference.md`

**PowerPoint Presentations (PPTX):**
- Main Guide: `~/.claude/skills/Utilities/Documents/Pptx/SKILL.md`
- Creation Reference: `~/.claude/skills/Utilities/Documents/Pptx/html2pptx.md`
- Editing Reference: `~/.claude/skills/Utilities/Documents/Pptx/ooxml.md`

**Excel Spreadsheets (XLSX):**
- Main Guide: `~/.claude/skills/Utilities/Documents/Xlsx/SKILL.md`
- Recalc Script: `~/.claude/skills/Utilities/Documents/Xlsx/recalc.py`

---

## Summary

**The documents skill provides comprehensive document processing:**

- **DOCX** - Create, edit, analyze Word documents with tracked changes support
- **PDF** - Create, manipulate, extract from PDFs with form filling capabilities
- **PPTX** - Create, edit presentations with professional design and templates
- **XLSX** - Create, edit spreadsheets with formulas and financial modeling

**Reference-based organization** - Each document type has complete guides and tooling

**Routing is automatic** - Analyzes user intent and activates appropriate document type workflow

**Professional quality** - Standards and best practices for production-ready documents

Related Skills

Utilities

11146
from danielmiessler/Personal_AI_Infrastructure

Developer utilities and tools — CLI generation, skill scaffolding, agent delegation, system upgrades, evals, documents, parsing, audio editing, Fabric patterns, Cloudflare infrastructure, browser automation, meta-prompting, and aphorisms. USE WHEN create CLI, build CLI, command-line tool, wrap API, add command, upgrade tier, TypeScript CLI, create skill, new skill, scaffold skill, validate skill, update skill, fix skill structure, canonicalize skill, parallel execution, agent teams, delegate, workstreams, swarm, upgrade, improve system, system upgrade, check Anthropic, algorithm upgrade, mine reflections, find sources, research upgrade, PAI upgrade, eval, evaluate, test agent, benchmark, verify behavior, regression test, capability test, run eval, compare models, compare prompts, create judge, view results, document, process file, create document, convert format, extract text, PDF, DOCX, XLSX, PPTX, Word, Excel, spreadsheet, PowerPoint, presentation, slides, consulting report, large PDF, merge PDF, fill form, tracked changes, redlining, parse, extract, URL, transcript, entities, JSON, batch, YouTube, article, newsletter, Twitter, browser extension, collision detection, detect content type, extract article, extract newsletter, extract YouTube, extract PDF, parse content, clean audio, edit audio, remove filler words, clean podcast, remove ums, cut dead air, polish audio, transcribe, analyze audio, audio pipeline, fabric, fabric pattern, run fabric, update patterns, sync fabric, summarize, threat model pattern, Cloudflare, worker, deploy, Pages, MCP server, wrangler, DNS, KV, R2, D1, Vectorize, browser, screenshot, debug web, verify UI, troubleshoot frontend, automate browser, browse website, review stories, run stories, web automation, meta-prompting, template generation, prompt optimization, programmatic prompt, render template, validate template, prompt engineering, aphorism, quote, saying, find quote, research thinker, newsletter quotes, add aphorism, search aphorisms.

ContentAnalysis

11146
from danielmiessler/Personal_AI_Infrastructure

Content extraction and analysis — wisdom extraction from videos, podcasts, articles, and YouTube. USE WHEN extract wisdom, content analysis, analyze content, insight report, analyze video, analyze podcast, extract insights, key takeaways, what did I miss, extract from YouTube.

WriteStory

11146
from danielmiessler/Personal_AI_Infrastructure

Layered fiction writing system using Will Storr's storytelling science and rhetorical figures. USE WHEN write story, fiction, novel, short story, book, chapter, story bible, character arc, plot outline, creative writing, worldbuilding, narrative, mystery writing, dialogue, prose, series planning.

USMetrics

11146
from danielmiessler/Personal_AI_Infrastructure

US economic indicators. USE WHEN GDP, inflation, unemployment, economic metrics, gas prices. SkillSearch('usmetrics') for docs.

Sales

11146
from danielmiessler/Personal_AI_Infrastructure

Sales workflows. USE WHEN sales, proposal, pricing. SkillSearch('sales') for docs.

PAI

11146
from danielmiessler/Personal_AI_Infrastructure

Personal AI Infrastructure core. The authoritative reference for how PAI works.

VoiceServer

11146
from danielmiessler/Personal_AI_Infrastructure

Voice server management. USE WHEN voice server, TTS server, voice notification, prosody.

THEALGORITHM

11146
from danielmiessler/Personal_AI_Infrastructure

Universal execution engine using scientific method to achieve ideal state. USE WHEN complex tasks, multi-step work, "run the algorithm", "use the algorithm", OR any non-trivial request that benefits from structured execution with ISC (Ideal State Criteria) tracking.

System

11146
from danielmiessler/Personal_AI_Infrastructure

System maintenance with three core operations - integrity check (find/fix broken references), document session (current transcript), document recent (catch-up since last update). Plus security workflows. USE WHEN integrity check, audit system, document session, document this session, document today, document recent, catch up docs, what's undocumented, check for secrets, security scan, privacy check, OR asking about past work ("we just worked on", "remember when we").

CORE

11146
from danielmiessler/Personal_AI_Infrastructure

Personal AI Infrastructure core. AUTO-LOADS at session start. The authoritative reference for how the PAI system works, how to use it, and all system-level configuration. USE WHEN any session begins, user asks about the system, identity, configuration, workflows, security, or any other question about how the PAI system operates.

thinking

11146
from danielmiessler/Personal_AI_Infrastructure

Multi-mode analytical and creative thinking — first principles decomposition, iterative depth analysis, creative brainstorming, multi-agent council debates, adversarial red teaming, world threat modeling, and scientific hypothesis testing. USE WHEN first principles, decompose, deconstruct, reconstruct, challenge assumptions, iterative depth, multi-angle, deep exploration, be creative, brainstorm, divergent ideas, tree of thoughts, maximum creativity, technical creativity, idea generation, domain specific, council, debate, perspectives, quick consensus, red team, critique, stress test, adversarial validation, parallel analysis, devil's advocate, threat model, world model, future analysis, test idea, test investment, update models, view models, time horizon, think about, figure out, experiment, iterate, science, hypothesis, define goal, design experiment, quick diagnosis, structured investigation, full cycle.

telos

11146
from danielmiessler/Personal_AI_Infrastructure

Life OS and project analysis. USE WHEN TELOS, life goals, projects, dependencies, books, movies. SkillSearch('telos') for docs.