api-server-mcp

api server mcp

7,385 stars

Best use case

api-server-mcp is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

api server mcp

Teams using api-server-mcp should expect a more consistent output, faster repeated execution, less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$curl -o ~/.claude/skills/api-server-mcp/SKILL.md --create-dirs "https://raw.githubusercontent.com/kreuzberg-dev/kreuzberg/main/.ai-rulez/skills/api-server-mcp/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/api-server-mcp/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How api-server-mcp Compares

Feature / Agentapi-server-mcpStandard Approach
Platform SupportNot specifiedLimited / Varies
Context Awareness High Baseline
Installation ComplexityUnknownN/A

Frequently Asked Questions

What does this skill do?

api server mcp

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

Related Guides

SKILL.md Source

# API Server & MCP Protocol

**Axum server design for document extraction endpoints, middleware, async processing, and Model Context Protocol integration for AI agents**

## Kreuzberg API Architecture

**Location**: `crates/kreuzberg/src/api/`, `crates/kreuzberg-cli/`

Kreuzberg provides a dual REST API + MCP server built with Axum + Tokio.

```text
Request Flow:
HTTP Client / AI Agent (Claude)
    |
[Transport Layer]
├── REST API (Axum HTTP)
└── MCP Protocol (HTTP or Stdio)
    |
[Middleware Layer]
├── CORS, Request Logging (TraceLayer)
├── Request/Response size limits
└── Rate limiting (optional)
    |
[Router]
├── REST Endpoints
│   ├── POST /extract - File upload extraction
│   ├── POST /extract-url - URL-based extraction
│   ├── GET /formats - List supported formats
│   ├── GET /health - Server health check
│   ├── POST /batch - Batch document processing
│   ├── GET /cache/stats - Cache statistics
│   └── DELETE /cache - Clear extraction cache
├── MCP Endpoints
│   ├── POST /mcp/tools - List available tools
│   ├── POST /mcp/tools/call - Call a tool
│   ├── GET /mcp/resources - List resources
│   ├── GET /mcp/resources/:uri - Read resource
│   ├── GET /mcp/prompts - List prompts
│   └── GET /mcp/prompts/:name - Get prompt
    |
[Handler / Tool Layer]
├── extract_handler / extract_file tool
├── batch_handler / batch_extract tool
├── health_handler / get_capabilities tool
└── format_handler
    |
[Extraction Core]
├── Format detection
├── Extraction pipeline
├── Post-processing (chunking, embeddings)
└── Result formatting
    |
JSON Response / MCP ToolResult
```

## Server Setup & Configuration

**Location**: `crates/kreuzberg/src/api/server.rs`

Server initialization pattern: Create `ApiState` (holds `ExtractionConfig` + `ExtractionCache`), build Axum `Router` with all REST + MCP routes, apply middleware layers (body limits, CORS, tracing), serve via `tokio::net::TcpListener`.

Key middleware layers applied in order:

- `DefaultBodyLimit::max(100MB)` + `RequestBodyLimitLayer` -- configurable via env vars
- `CorsLayer::permissive()` -- restrict in production via `CORS_ALLOWED_ORIGINS`
- `TraceLayer::new_for_http()` -- request/response logging

## Core REST Handlers

**Location**: `crates/kreuzberg/src/api/handlers.rs`

| Handler | Method | Description |
|---------|--------|-------------|
| `extract_handler` | POST /extract | Multipart upload: parse file + optional config JSON, check cache, call `extract_bytes()`, cache result |
| `extract_url_handler` | POST /extract-url | Fetch URL via reqwest, extract bytes |
| `batch_handler` | POST /batch | Parallel extraction with `Semaphore`-limited concurrency (default: CPU count) |
| `health_handler` | GET /health | Report status, version, uptime, feature availability (OCR, embeddings), cache stats |
| `formats_handler` | GET /formats | Return supported format categories (office, pdf, images, web, email, archives, academic) |
| `cache_stats_handler` | GET /cache/stats | Hit/miss counts and hit rate |
| `cache_clear_handler` | DELETE /cache | Clear LRU cache |

## Caching Strategy

**Location**: `crates/kreuzberg/src/cache/mod.rs`

LRU cache keyed by `SHA256(file_content)`, stores `Arc<ExtractionResult>`. Default 1000 entries. Thread-safe via `RwLock`. Tracks hit/miss counters with `AtomicU64` for stats endpoint.

## Error Handling

**Location**: `crates/kreuzberg/src/api/error.rs`

`ApiError` enum maps to HTTP status codes:

- `MissingFile` -> 400, `FileNotFound` -> 404
- `OnnxRuntimeMissing` / `TesseractMissing` -> 503 (with remediation message)
- `PayloadTooLarge` -> 413
- `ExtractionFailed` / `InvalidConfig` / `UnsupportedFormat` -> 500

## MCP Server Implementation

**Location**: `crates/kreuzberg/src/mcp/server.rs`

The MCP server allows Claude and other AI agents to call Kreuzberg extraction functions through the Model Context Protocol.

### MCP Tools (Callable Functions)

Three tools are registered:

| Tool | Purpose | Required Params |
|------|---------|----------------|
| `extract_file` | Extract text/tables/metadata from documents (75+ formats) | `file_path` |
| `batch_extract` | Extract from multiple documents in parallel | `file_paths[]` |
| `get_capabilities` | List supported formats, features, backends | (none) |

**Tool registration pattern** (example: `extract_file`):

```rust
// Define Tool with name, description, JSON Schema inputSchema
// Register with server.register_tool(tool, handler_fn)
// Handler: parse params -> build ExtractionConfig -> call extract_file() -> return ToolResult as JSON
```

`extract_file` optional params: `format`, `extract_tables`, `extract_images`, `ocr_enabled`, `extract_metadata`, `chunking_preset`, `generate_embeddings`.

### MCP Resources (Static Knowledge)

Three resources provide static information to agents:

- `kreuzberg://formats` -- Supported format list as JSON
- `kreuzberg://features` -- Cross-binding feature matrix (from `FEATURE_MATRIX.md`)
- `kreuzberg://api-reference` -- Generated API documentation

### MCP Prompts (Agent Templates)

Two prompts guide agent extraction workflows:

- `extract_for_rag` -- Document type-specific RAG extraction guidance (research paper, contract, report). Recommends chunking preset and embedding config.
- `batch_document_processing` -- Optimal concurrency, grouping, and error handling for batch workflows.

### MCP Transport Protocols

- **HTTP/REST**: MCP routes mounted alongside REST API on separate `/mcp/` prefix
- **Stdio**: JSON-RPC 2.0 over stdin/stdout for local CLI integration (e.g., Claude Desktop)

### Integration with Claude Desktop

```json
{
  "mcpServers": {
    "kreuzberg": {
      "command": "kreuzberg-mcp",
      "env": {
        "KREUZBERG_API_BASE": "http://localhost:8000",
        "KREUZBERG_MCP_TRANSPORT": "stdio"
      }
    }
  }
}
```

### MCP Error Handling

`ToolError` variants: `FileNotFound`, `UnsupportedFormat`, `ExtractionFailed`, `OnnxRuntimeMissing`, `TesseractMissing`, `Timeout`. Each maps to an MCP `ToolResultError` with descriptive code and message.

## Environment Configuration

See `.env.example` for all configurable variables. Key categories:

- **Server**: `KREUZBERG_HOST`, `KREUZBERG_PORT`
- **Size limits**: `KREUZBERG_MAX_REQUEST_BODY_BYTES` (default 100MB), `KREUZBERG_MAX_MULTIPART_FIELD_BYTES`
- **Features**: `KREUZBERG_ENABLE_OCR`, `KREUZBERG_ENABLE_EMBEDDINGS`, `KREUZBERG_ENABLE_KEYWORDS`
- **Cache**: `KREUZBERG_CACHE_ENABLED`, `KREUZBERG_CACHE_SIZE`
- **CORS**: `CORS_ALLOWED_ORIGINS` (comma-separated)
- **MCP**: `KREUZBERG_MCP_HOST`, `KREUZBERG_MCP_PORT`, `KREUZBERG_MCP_TRANSPORT` (stdio/http)
- **Logging**: `RUST_LOG=kreuzberg=info,tower_http=debug`

## Critical Rules

### REST API Rules

1. **Always validate multipart file uploads** - Check MIME type, size, magic bytes
2. **Timeout long-running extractions** - Set per-handler timeout (5 min default)
3. **Stream large files** - Never buffer entire multi-GB file in memory
4. **Cache aggressively** - Identical files should return from cache in <1ms
5. **Parallel extraction is CPU-bound** - Limit workers to CPU count + 1
6. **Error responses must be actionable** - Include error code and remediation suggestion
7. **Health checks must verify features** - Report missing dependencies (ONNX, Tesseract)
8. **Size limits are configurable** - Allow override via env var for large deployments
9. **CORS is permissive by default** - Restrict in production via env var
10. **Logging all requests** - Track extraction metrics for observability

### MCP Rules

1. **All tools must have timeout** - Prevent hanging on large files (default 5 min)
2. **Error responses must be detailed** - Include suggestions for missing dependencies
3. **Feature gates must be checked** - Return helpful message if feature unavailable (embeddings, OCR)
4. **Resources should be static** - Don't query external services in resource handlers
5. **Prompts guide agents** - Provide clear examples and best practices
6. **Batch tools must support cancellation** - Allow agent to stop long-running batch operations
7. **Logging all tool calls** - Track usage for analytics and debugging

## Related Skills

- **extraction-pipeline-patterns** - Core extraction called by handlers and MCP tools
- **chunking-embeddings** - Optional chunking/embedding parameters in extraction
- **ocr-backend-management** - OCR engine selection and image preprocessing