extract-transcripts
Extract readable transcripts from Claude Code and Codex CLI session JSONL files
Best use case
extract-transcripts is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Extract readable transcripts from Claude Code and Codex CLI session JSONL files
Teams using extract-transcripts should expect a more consistent output, faster repeated execution, less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in
.claude/skills/extract-transcripts/SKILL.mdinside your project - Restart your AI agent — it will auto-discover the skill
How extract-transcripts Compares
| Feature / Agent | extract-transcripts | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Extract readable transcripts from Claude Code and Codex CLI session JSONL files
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# Extract Transcripts Extracts readable markdown transcripts from Claude Code and Codex CLI session JSONL files. ## Scripts ### Claude Code Sessions ```bash # Extract a single session python3 ~/.claude/skills/extract-transcripts/extract_transcript.py <session.jsonl> # With tool calls and thinking blocks python3 ~/.claude/skills/extract-transcripts/extract_transcript.py <session.jsonl> --include-tools --include-thinking # Extract all sessions from a directory python3 ~/.claude/skills/extract-transcripts/extract_transcript.py <directory> --all # Output to file python3 ~/.claude/skills/extract-transcripts/extract_transcript.py <session.jsonl> -o output.md # Summary only (quick overview) python3 ~/.claude/skills/extract-transcripts/extract_transcript.py <session.jsonl> --summary # Skip empty/warmup-only sessions python3 ~/.claude/skills/extract-transcripts/extract_transcript.py <directory> --all --skip-empty ``` **Options:** - `--include-tools`: Include tool calls and results - `--include-thinking`: Include Claude's thinking blocks - `--all`: Process all .jsonl files in directory - `-o, --output`: Output file path (default: stdout) - `--summary`: Only output brief summary - `--skip-empty`: Skip empty and warmup-only sessions - `--min-messages N`: Minimum messages for --skip-empty (default: 2) ### Codex CLI Sessions ```bash # Extract a Codex session python3 ~/.claude/skills/extract-transcripts/extract_codex_transcript.py <session.jsonl> # Extract from Codex history file python3 ~/.claude/skills/extract-transcripts/extract_codex_transcript.py ~/.codex/history.jsonl --history ``` ## Session File Locations ### Claude Code - Sessions: `~/.claude/projects/<project-path>/<session-id>.jsonl` ### Codex CLI - Sessions: `~/.codex/sessions/<session_id>/rollout.jsonl` - History: `~/.codex/history.jsonl` ## DuckDB-Based Transcript Index For querying across many sessions, use the DuckDB-based indexer: ```bash # Index all sessions (incremental - only new/changed files) python3 ~/.claude/skills/extract-transcripts/transcript_index.py index # Force full reindex python3 ~/.claude/skills/extract-transcripts/transcript_index.py index --full # Limit number of files to process python3 ~/.claude/skills/extract-transcripts/transcript_index.py index --limit 10 # List recent sessions python3 ~/.claude/skills/extract-transcripts/transcript_index.py recent python3 ~/.claude/skills/extract-transcripts/transcript_index.py recent --limit 20 python3 ~/.claude/skills/extract-transcripts/transcript_index.py recent --project myapp python3 ~/.claude/skills/extract-transcripts/transcript_index.py recent --since 7d # Search across sessions python3 ~/.claude/skills/extract-transcripts/transcript_index.py search "error handling" python3 ~/.claude/skills/extract-transcripts/transcript_index.py search "query" --cwd ~/myproject # Show a session transcript python3 ~/.claude/skills/extract-transcripts/transcript_index.py show <file_path> python3 ~/.claude/skills/extract-transcripts/transcript_index.py show <file_path> --summary ``` **Requirements:** DuckDB (`pip install duckdb`) **Database location:** `~/.claude/transcript-index/sessions.duckdb` ## Output Format Transcripts are formatted as markdown with: - Session metadata (date, duration, model, working directory, git branch) - User messages prefixed with `## User` - Assistant responses prefixed with `## Assistant` - Tool calls in code blocks (if --include-tools) - Thinking in blockquotes (if --include-thinking) - Tool usage summary for Codex sessions
Related Skills
metadata-extractor
Metadata Extractor - Auto-activating skill for Data Pipelines. Triggers on: metadata extractor, metadata extractor Part of the Data Pipelines skill category.
extraction-proposer
Scan ICE-Crawler extraction logs, pick promising algorithms/tools, and emit skill creation proposals (name, scope, source files, next steps).
java-refactoring-extract-method
Refactoring using Extract Methods in Java Language
security-requirement-extraction
Derive security requirements from threat models and business context. Use when translating threats into actionable requirements, creating security user stories, or building security test cases.
extract
Extract and consolidate reusable components, design tokens, and patterns into your design system. Identifies opportunities for systematic reuse and enriches your component library.
control-loop-extraction
Extract and analyze agent reasoning loops, step functions, and termination conditions. Use when needing to (1) understand how an agent framework implements reasoning (ReAct, Plan-and-Solve, Reflection, etc.), (2) locate the core decision-making logic, (3) analyze loop mechanics and termination conditions, (4) document the step-by-step execution flow of an agent, or (5) compare reasoning patterns across frameworks.
star-story-extraction
Auto-invoke after task completion to extract interview-ready STAR stories from completed work.
resume-bullet-extraction
Auto-invoke after task completion to generate powerful resume bullet points from completed work.
design-spec-extraction
Extract comprehensive JSON design specifications from visual sources including Figma exports, UI mockups, screenshots, or live website captures. Produces W3C DTCG-compliant output with component trees, suitable for code generation, design documentation, and developer handoff.
standards-extraction
Extract coding standards and conventions from CONTRIBUTING.md, .editorconfig, linter configs. Use for onboarding and ensuring consistent contributions.
extract-from-pdfs
This skill should be used when extracting structured data from scientific PDFs for systematic reviews, meta-analyses, or database creation. Use when working with collections of research papers that need to be converted into analyzable datasets with validation metrics.
epub-chapter-extractor
Extract all chapters from an EPUB file into separate markdown files. Use when the user wants to split an EPUB into individual chapter files, extract EPUB chapters, or convert an ebook to separate markdown documents.