api-reference
PDF-RAG API reference. REQUIRED after any failed curl/jq to localhost:8000 (404, null, jq error). Also use when uncertain about endpoint path or response shape.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in `.claude/skills/api-reference/SKILL.md` inside your project
- Restart your AI agent; it will auto-discover the skill
SKILL.md Source
# PDF-RAG API Skill
**Use this skill when:** curl/jq fails, response shape is wrong, or you're unsure of the endpoint.
## Cheat Sheet
### Response Paths
| Endpoint | jq path |
|----------|---------|
| parties | `.resolved.canonical_parties[].canonical_name` |
| dates | `.resolved.governing_date` |
| land | `.resolved.canonical_tracts[].tract_identity_phrase` |
| doc_type | `.resolved.doc_category` |
| conveyance | `.resolved.conveys_fee` |
| documents list | `.items[]` |
| packages list | `.packages[]` |
| package docs | `.documents[]` |
| chain edges | `.chains["fee"].dag.edges[]` |
| estate graph nodes | `.dag.nodes[]` |
| estate graph summary | `.summary` |
| estate graph gaps | `.dag.gaps[]` |
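For callers working in Python rather than jq, the paths in the table translate to plain dictionary access. The sketch below mirrors three of the rows. The function names and the sample payload are illustrative assumptions, not part of the API.

```python
from typing import Any


def canonical_party_names(parties_response: dict[str, Any]) -> list[str]:
    """Mirror of jq `.resolved.canonical_parties[].canonical_name`."""
    parties = parties_response["resolved"]["canonical_parties"]
    return [p["canonical_name"] for p in parties]


def tract_phrases(land_response: dict[str, Any]) -> list[str]:
    """Mirror of jq `.resolved.canonical_tracts[].tract_identity_phrase`."""
    tracts = land_response["resolved"]["canonical_tracts"]
    return [t["tract_identity_phrase"] for t in tracts]


def governing_date(dates_response: dict[str, Any]) -> Any:
    """Mirror of jq `.resolved.governing_date`."""
    return dates_response["resolved"]["governing_date"]


# Hypothetical payload shaped like the table above; real responses carry more fields.
sample_parties = {
    "resolved": {
        "canonical_parties": [
            {"canonical_name": "Jane Doe"},
            {"canonical_name": "Acme Oil Co."},
        ]
    }
}
print(canonical_party_names(sample_parties))  # ['Jane Doe', 'Acme Oil Co.']
```

Unlike jq, which yields `null` for a missing key, these helpers raise `KeyError`, so a wrong path fails loudly instead of silently returning nothing.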
### Stage Names
```
extract_parties, extract_dates, extract_land_description, extract_conveyance, extract_document_classification
markdown, segmentation, assembly, runsheet, runtime_linking
```
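A misspelled stage name is an easy mistake to make, so a client-side check against the list above can catch it before a request is sent. This is a minimal sketch: `KNOWN_STAGES` and `unknown_stages` are hypothetical names, and how the server itself treats an unknown stage is not stated here.

```python
# The ten stage names listed above, copied verbatim.
KNOWN_STAGES = frozenset({
    "extract_parties", "extract_dates", "extract_land_description",
    "extract_conveyance", "extract_document_classification",
    "markdown", "segmentation", "assembly", "runsheet", "runtime_linking",
})


def unknown_stages(stages: list[str]) -> list[str]:
    """Return the requested stage names that are not in KNOWN_STAGES, in order."""
    return [s for s in stages if s not in KNOWN_STAGES]


print(unknown_stages(["extract_parties", "extract_land", "markdown"]))  # ['extract_land']
```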
## Shell Command Pattern
**IMPORTANT:** Never use shell variable interpolation in curl commands. Always inline values directly:
```bash
# WRONG - variable interpolation can fail
DOC_ID="abc123" && curl -s "http://localhost:8000/api/documents/$DOC_ID/parties"
# RIGHT - inline the value directly
curl -s "http://localhost:8000/api/documents/abc123/parties"
```
## Gotchas & Common Mistakes
### Estate Graph (L3)
- **GET vs POST**: `GET` retrieves stored graph, `POST` rebuilds from L0-L2 data
- **estate_type is lowercase**: `"fee"`, `"mineral"`, `"leasehold"` (not uppercase)
- **Chain = nodes grouped by tract**: A "mineral chain" = all mineral nodes sharing same `tract_id`
- **Quick summary**: `.summary.mineral_estates` gives count
- **List nodes by type**: `.dag.nodes[] | select(.estate_type == "mineral")`
- **Group into chains**: `[.dag.nodes[] | select(.estate_type == "mineral")] | group_by(.tract_id)`
- **Count chains**: append `| length` to the grouping query above
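The filter, group, and count steps above can also be done on the parsed JSON in Python. The sketch below assumes only the `estate_type` and `tract_id` node fields named in these notes; the function name and the sample graph are made up for illustration.

```python
from collections import defaultdict
from typing import Any


def chains_by_tract(graph: dict[str, Any], estate_type: str) -> dict[Any, list[dict[str, Any]]]:
    """Group nodes of one estate type by tract_id (one group = one chain)."""
    chains: dict[Any, list[dict[str, Any]]] = defaultdict(list)
    for node in graph["dag"]["nodes"]:
        # estate_type values are lowercase: "fee", "mineral", "leasehold"
        if node["estate_type"] == estate_type:
            chains[node["tract_id"]].append(node)
    return dict(chains)


# Hypothetical estate graph with two mineral chains (tracts T1 and T2).
sample_graph = {"dag": {"nodes": [
    {"estate_type": "mineral", "tract_id": "T1"},
    {"estate_type": "mineral", "tract_id": "T1"},
    {"estate_type": "mineral", "tract_id": "T2"},
    {"estate_type": "fee", "tract_id": "T1"},
]}}
mineral_chains = chains_by_tract(sample_graph, "mineral")
print(len(mineral_chains))  # 2
```

jq's `group_by` sorts groups by key, whereas this dictionary keeps first-seen order; the chain count is the same either way. Passing `"MINERAL"` returns an empty result, which is the uppercase pitfall noted above.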
### Rebuild Estate Graph (Exploratory Mode 1.1)
**Endpoint**: `POST /api/packages/{package_id}/estate-graph`
**Request body** (JSON):
```json
{"version": "1.1"}
```
**Full curl command**:
```bash
curl -s -X POST "http://localhost:8000/api/packages/{package_id}/estate-graph" \
-H "Content-Type: application/json" \
-d '{"version": "1.1"}'
```
**Version options**:
- `"1.0"`: Exploratory Layers - POC 5-layer fuzzy matcher (L0-L4 gates)
- `"1.1"`: Exploratory Signals - EdgeScore-based matcher (includes comparison field)
- `"2.0"`: Operative deterministic matcher (exact matches only)
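As a sketch of the same rebuild call from Python, the helper below prepares the POST with the standard library and rejects any version string other than the three listed above. `build_rebuild_request`, `VERSIONS`, and the package id `pkg-123` are hypothetical; only the endpoint path, header, and body shape come from this section.

```python
import json
import urllib.request

VERSIONS = ("1.0", "1.1", "2.0")  # the three matcher versions listed above


def build_rebuild_request(package_id: str, version: str,
                          base_url: str = "http://localhost:8000") -> urllib.request.Request:
    """Prepare (but do not send) POST /api/packages/{package_id}/estate-graph."""
    if version not in VERSIONS:
        raise ValueError(f"version must be one of {VERSIONS}, got {version!r}")
    return urllib.request.Request(
        url=f"{base_url}/api/packages/{package_id}/estate-graph",
        # The version is sent as a JSON string ("1.1"), not a number.
        data=json.dumps({"version": version}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_rebuild_request("pkg-123", "1.1")  # "pkg-123" is a made-up package id
print(req.get_method(), req.full_url)
```

Passing the prepared request to `urllib.request.urlopen(req)` would send it. Building and sending are kept separate here so the request can be inspected without a running server.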
**Response paths**:
- Nodes: `.dag.nodes[]`
- Edges: `.dag.edges[]`
- Gaps: `.dag.gaps[]`
- Summary: `.summary`
- Exploratory outcomes (v1.1): `.exploratory_outcomes`
- Winner/runner-up comparison (v1.1): `.exploratory_outcomes[estate_id].comparison`
**Comparison data (v1.1)**:
```
.exploratory_outcomes[estate_id].comparison:
  .winner_id: string
  .runner_up_id: string | null
  .rank_delta: number
  .top_winner_reasons[]: {signal, winner_state, runner_up_state, differentiation}
  .top_runner_up_losses[]: {signal, issue}
  .suppression_analysis: {eligible, blocking_contradictions[]}
```
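To show how the comparison fields fit together, here is a minimal Python sketch that turns one estate's comparison into a one-line summary. The field names follow the shape above. The function name, the ids, and the signal names in the sample are invented, and treating a null `runner_up_id` as "no runner-up" is an assumption.

```python
from typing import Any


def describe_comparison(response: dict[str, Any], estate_id: str) -> str:
    """One-line summary of `.exploratory_outcomes[estate_id].comparison`."""
    comparison = response["exploratory_outcomes"][estate_id]["comparison"]
    runner_up = comparison["runner_up_id"]  # JSON null arrives as None
    if runner_up is None:
        return f"{comparison['winner_id']} has no runner-up"
    signals = ", ".join(r["signal"] for r in comparison["top_winner_reasons"])
    return (f"{comparison['winner_id']} beat {runner_up} "
            f"by rank_delta={comparison['rank_delta']} on: {signals}")


# Hypothetical v1.1 response fragment; ids and signal names are made up.
sample_response = {"exploratory_outcomes": {"estate-1": {"comparison": {
    "winner_id": "doc-7",
    "runner_up_id": "doc-9",
    "rank_delta": 0.25,
    "top_winner_reasons": [{"signal": "grantor_name_match"}, {"signal": "date_order"}],
}}}}
print(describe_comparison(sample_response, "estate-1"))
```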
### Chain Analysis (deprecated - use Estate Graph)
- **Use the endpoint, not Package.meta**: Chain analysis results are in `GET /packages/{id}/chain-analysis`, not buried in `Package.meta.chain_analysis`
- **L2 before L3**: Run L2 clustering (`POST /packages/{id}/l2/parties`) BEFORE chain analysis for proper cross-document party matching
- **Chain types are dict keys**: Access via `.chains["fee"]`, `.chains["mineral"]`, `.chains["leasehold"]`
### Extraction Endpoints
- **Always use `.resolved`**: The clean data is in `.resolved`, raw LLM output is in `.extraction` (debug mode only)
- **Debug mode**: Add `?debug=true` to get raw extraction + normalized + extraction_context
- **Party IDs are document-scoped**: `party_id` is unique only within a document; use L2 clusters for cross-document identity
### List Endpoints
- **Documents list**: `.items[]` (paginated response)
- **Packages list**: `.packages[]` (NOT `.[]`)
- **Package documents**: `.documents[]` (nested in package detail)
- **Chain edges**: `.chains["fee"].dag.edges[]`
### Processing
- **Stages array**: `POST /api/process` takes `stages: ["extract_parties", "extract_dates", ...]`
- **Poll for completion**: `GET /api/jobs/{job_id}` until status is `"completed"` or `"failed"`
- **enrich_only=true**: Re-run L1 enrichment without re-running L0 extraction (saves LLM cost)
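The polling step above could be sketched in Python as follows. The fetch function is injected so the loop can be exercised without a server. It is assumed that the job response exposes a top-level `status` key, and the intermediate value `"running"` in the demo is a placeholder; only `"completed"` and `"failed"` are confirmed by this section.

```python
import time
from typing import Any, Callable

TERMINAL_STATUSES = {"completed", "failed"}


def wait_for_job(fetch_job: Callable[[], dict[str, Any]],
                 interval_s: float = 2.0, max_polls: int = 150) -> dict[str, Any]:
    """Call fetch_job until its `status` is terminal, then return that response."""
    for attempt in range(max_polls):
        job = fetch_job()
        if job.get("status") in TERMINAL_STATUSES:
            return job
        if attempt < max_polls - 1:
            time.sleep(interval_s)  # wait before the next poll
    raise TimeoutError(f"job still not finished after {max_polls} polls")


# Stand-in for GET /api/jobs/{job_id}; a real fetch_job would issue the HTTP call.
fake_responses = iter([{"status": "running"}, {"status": "running"}, {"status": "completed"}])
result = wait_for_job(lambda: next(fake_responses), interval_s=0.0)
print(result["status"])  # completed
```

The `max_polls` cap is a design choice to keep a stuck job from blocking the caller forever. A `"failed"` status is returned rather than raised so the caller can inspect the job response.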
---
**Need full schema details?** Read `.claude/skills/api-reference/API_REFERENCE.md`