api-reference

PDF-RAG API reference. REQUIRED after any failed curl/jq to localhost:8000 (404, null, jq error). Also use when uncertain about endpoint path or response shape.


by diegosouzapw


Best use case

api-reference is best used when you need a repeatable AI agent workflow instead of a one-off prompt.


Teams using api-reference should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/api-reference/SKILL.md --create-dirs "https://raw.githubusercontent.com/diegosouzapw/awesome-omni-skill/main/skills/backend/api-reference/SKILL.md"

Manual Installation

1. Download SKILL.md from GitHub.
2. Place it in .claude/skills/api-reference/SKILL.md inside your project.
3. Restart your AI agent; it will auto-discover the skill.

How api-reference Compares

| Feature / Agent | api-reference | Standard Approach |
|-----------------|---------------|-------------------|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |


SKILL.md Source

# PDF-RAG API Skill

**Use this skill when:** curl/jq fails, response shape is wrong, or you're unsure of the endpoint.

## Cheat Sheet

### Response Paths
| Endpoint | jq path |
|----------|---------|
| parties | `.resolved.canonical_parties[].canonical_name` |
| dates | `.resolved.governing_date` |
| land | `.resolved.canonical_tracts[].tract_identity_phrase` |
| doc_type | `.resolved.doc_category` |
| conveyance | `.resolved.conveys_fee` |
| documents list | `.items[]` |
| packages list | `.packages[]` |
| package docs | `.documents[]` |
| chain edges | `.chains["fee"].dag.edges[]` |
| estate graph nodes | `.dag.nodes[]` |
| estate graph summary | `.summary` |
| estate graph gaps | `.dag.gaps[]` |
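A minimal sketch of how a path from the table behaves, run against an inline sample rather than the live API. The sample JSON is hypothetical and only mirrors the `.resolved.canonical_parties[].canonical_name` path; real responses likely carry more fields. It also shows `jq -e`, which exits non-zero on a `null` result and so surfaces a wrong path early.

```bash
# Hypothetical sample shaped like a parties response; real responses likely carry more fields.
sample_parties='{"resolved":{"canonical_parties":[{"canonical_name":"Jane Doe"},{"canonical_name":"Acme Oil LLC"}]}}'

# Correct path from the table: prints one canonical name per line (-r drops the JSON quotes).
printf '%s' "$sample_parties" | jq -r '.resolved.canonical_parties[].canonical_name'

# Wrong path (missing .resolved): jq -e exits non-zero because the result is null.
if ! printf '%s' "$sample_parties" | jq -e '.canonical_parties' >/dev/null; then
  echo "null result: check the path against the table"
fi
```

Against the running service, the same filter would be applied to the output of `curl -s "http://localhost:8000/api/documents/abc123/parties"`.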

### Stage Names
```
extract_parties, extract_dates, extract_land_description, extract_conveyance, extract_document_classification
markdown, segmentation, assembly, runsheet, runtime_linking
```

## Shell Command Pattern

**IMPORTANT:** Never use shell variable interpolation in curl commands. Always inline values directly:

```bash
# WRONG - variable interpolation can fail
DOC_ID="abc123" && curl -s "http://localhost:8000/api/documents/$DOC_ID/parties"

# RIGHT - inline the value directly
curl -s "http://localhost:8000/api/documents/abc123/parties"
```

## Gotchas & Common Mistakes

### Estate Graph (L3)
- **GET vs POST**: `GET` retrieves stored graph, `POST` rebuilds from L0-L2 data
- **estate_type is lowercase**: `"fee"`, `"mineral"`, `"leasehold"` (not uppercase)
- **Chain = nodes grouped by tract**: A "mineral chain" = all mineral nodes sharing the same `tract_id`
- **Quick summary**: `.summary.mineral_estates` gives the count
- **List nodes by type**: `.dag.nodes[] | select(.estate_type == "mineral")`
- **Group into chains**: `[.dag.nodes[] | select(.estate_type == "mineral")] | group_by(.tract_id)`
- **Count chains**: Append `| length` to the grouping query above
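The grouping and counting queries above can be sketched end to end on a small inline graph. The sample data is made up for illustration: the node `id` field and the tract values are assumptions, and only `estate_type`, `tract_id`, and `.dag.nodes[]` come from this reference.

```bash
# Hypothetical estate graph: three mineral nodes across two tracts, plus one fee node.
sample_graph='{"dag":{"nodes":[
  {"id":"n1","estate_type":"mineral","tract_id":"t1"},
  {"id":"n2","estate_type":"mineral","tract_id":"t1"},
  {"id":"n3","estate_type":"mineral","tract_id":"t2"},
  {"id":"n4","estate_type":"fee","tract_id":"t1"}]}}'

# Count mineral chains: filter by estate_type, group by tract_id, count the groups.
count_mineral_chains() {
  jq '[.dag.nodes[] | select(.estate_type == "mineral")] | group_by(.tract_id) | length'
}

# Summarize each mineral chain as its tract_id and node count.
mineral_chain_sizes() {
  jq -c '[.dag.nodes[] | select(.estate_type == "mineral")]
         | group_by(.tract_id)
         | map({tract_id: .[0].tract_id, nodes: length})'
}

printf '%s' "$sample_graph" | count_mineral_chains   # prints 2
printf '%s' "$sample_graph" | mineral_chain_sizes
```

The fee node on tract `t1` is excluded by the `select`, so it does not inflate the mineral chain for that tract.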

### Rebuild Estate Graph (Exploratory Mode 1.1)

**Endpoint**: `POST /api/packages/{package_id}/estate-graph`

**Request body** (JSON):
```json
{"version": "1.1"}
```

**Full curl command**:
```bash
curl -s -X POST "http://localhost:8000/api/packages/{package_id}/estate-graph" \
  -H "Content-Type: application/json" \
  -d '{"version": "1.1"}'
```

**Version options**:
- `"1.0"`: Exploratory Layers - POC 5-layer fuzzy matcher (L0-L4 gates)
- `"1.1"`: Exploratory Signals - EdgeScore-based matcher (includes comparison field)
- `"2.0"`: Operative deterministic matcher (exact matches only)

**Response paths**:
- Nodes: `.dag.nodes[]`
- Edges: `.dag.edges[]`
- Gaps: `.dag.gaps[]`
- Summary: `.summary`
- Exploratory outcomes (v1.1): `.exploratory_outcomes`
- Winner/runner-up comparison (v1.1): `.exploratory_outcomes[estate_id].comparison`

**Comparison data (v1.1)**:
```
.exploratory_outcomes[estate_id].comparison:
  .winner_id: string
  .runner_up_id: string | null
  .rank_delta: number
  .top_winner_reasons[]: {signal, winner_state, runner_up_state, differentiation}
  .top_runner_up_losses[]: {signal, issue}
  .suppression_analysis: {eligible, blocking_contradictions[]}
```
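As a sketch of reading the comparison data, the filter below prints one tab-separated row per estate. It assumes `.exploratory_outcomes` is a JSON object keyed by estate id, which is inferred from the `[estate_id]` notation above and not confirmed; the ids and values in the sample are invented.

```bash
# Hypothetical v1.1 response fragment; estate and candidate ids are made up.
sample_outcomes='{"exploratory_outcomes":{
  "estate-1":{"comparison":{"winner_id":"cand-a","runner_up_id":"cand-b","rank_delta":0.4}},
  "estate-2":{"comparison":{"winner_id":"cand-c","runner_up_id":null,"rank_delta":0}}}}'

# One row per estate: estate id, winner, runner-up (or "none" when null), rank delta.
list_comparisons() {
  jq -r '.exploratory_outcomes
         | to_entries[]
         | [.key,
            .value.comparison.winner_id,
            (.value.comparison.runner_up_id // "none"),
            (.value.comparison.rank_delta | tostring)]
         | @tsv'
}

printf '%s' "$sample_outcomes" | list_comparisons
```

The `// "none"` fallback matters because `runner_up_id` is documented as `string | null`.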

### Chain Analysis (deprecated - use Estate Graph)
- **Use the endpoint, not Package.meta**: Chain analysis results are in `GET /packages/{id}/chain-analysis`, not buried in `Package.meta.chain_analysis`
- **L2 before L3**: Run L2 clustering (`POST /packages/{id}/l2/parties`) BEFORE chain analysis for proper cross-document party matching
- **Chain types are dict keys**: Access via `.chains["fee"]`, `.chains["mineral"]`, `.chains["leasehold"]`

### Extraction Endpoints
- **Always use `.resolved`**: The clean data is in `.resolved`; raw LLM output is in `.extraction` (debug mode only)
- **Debug mode**: Add `?debug=true` to get raw extraction + normalized + extraction_context
- **Party IDs are document-scoped**: `party_id` is unique within a document; use L2 clusters for cross-doc identity

### List Endpoints
- **Documents list**: `.items[]` (paginated response)
- **Packages list**: `.packages[]` (NOT `.[]`)
- **Package documents**: `.documents[]` (nested in package detail)
- **Chain edges**: `.chains["fee"].dag.edges[]`
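When a list path returns something unexpected, inspecting the top-level keys is a quick check before guessing again. The sketch below uses a hypothetical packages response (the `id` field is an assumption) to show why `.[]` is the wrong path: it yields the whole array as a single value, while `.packages[]` yields one value per package.

```bash
# Hypothetical packages list response; the "id" field is made up for illustration.
sample_packages='{"packages":[{"id":"p1"},{"id":"p2"}]}'

# Step 1: look at the top-level keys to find the wrapper field.
printf '%s' "$sample_packages" | jq -c 'keys'          # prints ["packages"]

# Step 2: .[] iterates the object's values, so it emits the array as one item.
printf '%s' "$sample_packages" | jq -c '.[]'

# Step 3: .packages[] emits each package separately.
printf '%s' "$sample_packages" | jq -c '.packages[]'
```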

### Processing
- **Stages array**: POST /api/process takes `stages: ["extract_parties", "extract_dates", ...]`
- **Poll for completion**: GET /api/jobs/{job_id} until status is "completed" or "failed"
- **enrich_only=true**: Re-run L1 enrichment without re-running L0 extraction (saves LLM cost)
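The polling step can be sketched as a small loop. This is an illustration under stated assumptions, not the project's own tooling: `job123` is a hypothetical job id (inlined, per the shell command pattern above), the job record is assumed to expose a top-level `.status` field, and the attempt limit and delay are arbitrary.

```bash
# Fetch the job record. job123 is a hypothetical job id, inlined rather than interpolated.
fetch_job() {
  curl -s "http://localhost:8000/api/jobs/job123"
}

# Poll until the job reaches a terminal status.
# Returns 0 on "completed", 1 on "failed", 2 if the attempt limit is reached.
wait_for_job() {
  local max_attempts=60 delay_seconds=2 attempt=1 status
  while [ "$attempt" -le "$max_attempts" ]; do
    # Assumes the status lives at .status; any other value keeps the loop going.
    status=$(fetch_job | jq -r '.status // "unknown"')
    case "$status" in
      completed) echo "completed"; return 0 ;;
      failed) echo "failed"; return 1 ;;
    esac
    sleep "$delay_seconds"
    attempt=$((attempt + 1))
  done
  echo "timeout"
  return 2
}
```

Running `wait_for_job` after submitting a job to `POST /api/process` blocks until the job finishes or the limit is hit; the exit code distinguishes the three outcomes.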

---

**Need full schema details?** Read `.claude/skills/api-reference/API_REFERENCE.md`
