mime-detection-routing

mime detection routing

7,385 stars

Best use case

mime-detection-routing is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using mime-detection-routing should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/mime-detection-routing/SKILL.md --create-dirs "https://raw.githubusercontent.com/kreuzberg-dev/kreuzberg/main/.ai-rulez/skills/mime-detection-routing/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/mime-detection-routing/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How mime-detection-routing Compares

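| Feature / Agent | mime-detection-routing | Standard Approach |
|-----------------|------------------------|-------------------|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |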

Frequently Asked Questions

What does this skill do?

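It documents how a file's MIME type is detected and routed to an extractor: the file extension is looked up in `EXT_TO_MIME`, the resulting MIME type is validated, and the extractor registry returns the extractor that handles it.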

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

## priority: high

# MIME Detection & Routing

## Detection Flow

```text
Extension → EXT_TO_MIME map → validate → Registry lookup → Extractor
```
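
The flow above can be sketched end to end with plain standard-library types. This is a minimal illustration, not the project's code: `ext_to_mime`, `registry`, `route`, the extractor names, and the `.xyz` mapping are all hypothetical stand-ins for the real `EXT_TO_MIME` table and extractor registry.

```rust
use std::collections::HashMap;
use std::path::Path;

/// Hypothetical stand-in for the extension table (the real one is `EXT_TO_MIME`).
fn ext_to_mime() -> HashMap<&'static str, &'static str> {
    HashMap::from([
        ("pdf", "application/pdf"),
        ("png", "image/png"),
        ("xyz", "application/x-unregistered"), // mapped, but no extractor below
    ])
}

/// Hypothetical stand-in for the registry: MIME type -> extractor name.
fn registry() -> HashMap<&'static str, &'static str> {
    HashMap::from([
        ("application/pdf", "PdfExtractor"),
        ("image/png", "ImageExtractor"),
    ])
}

/// Extension -> MIME map -> validate -> registry lookup -> extractor.
fn route(path: &str) -> Result<&'static str, String> {
    // 1. Take the extension, lowercased so the lookup is case-insensitive.
    let ext = Path::new(path)
        .extension()
        .and_then(|e| e.to_str())
        .map(|e| e.to_ascii_lowercase())
        .ok_or_else(|| format!("no extension: {path}"))?;

    // 2. Map the extension to a MIME type.
    let table = ext_to_mime();
    let mime = *table
        .get(ext.as_str())
        .ok_or_else(|| format!("unknown extension: {ext}"))?;

    // 3 and 4. Validate against the registry and pick the extractor.
    let extractor = registry().get(mime).copied();
    extractor.ok_or_else(|| format!("unsupported format: {mime}"))
}
```

In this sketch `route("report.PDF")` yields `Ok("PdfExtractor")`, while `route("data.xyz")` fails at the registry step even though the extension is mapped, which is why validation happens after the extension lookup.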

## Key Functions

| Function | Location | Purpose |
|----------|----------|---------|
| `detect_mime_type(path, inspect)` | `core/mime.rs` | Extension + optional content inspection |
| `detect_mime_type_from_bytes(bytes)` | `core/mime.rs` | Magic number detection (infer crate) |
| `validate_mime_type(mime)` | `core/mime.rs` | Check if any extractor supports it |
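
To illustrate what magic-number detection involves, here is a hand-rolled sketch that uses only the standard library. It is a stand-in for the `infer` crate, not a reproduction of it, and `sniff_mime` and `detect` are hypothetical names; the real `detect_mime_type` and `detect_mime_type_from_bytes` signatures may differ.

```rust
use std::path::Path;

/// Hand-rolled magic-number check covering a few well-known signatures.
fn sniff_mime(bytes: &[u8]) -> Option<&'static str> {
    // (leading bytes, MIME type) pairs.
    const SIGNATURES: &[(&[u8], &str)] = &[
        (b"%PDF-", "application/pdf"),
        (b"\x89PNG\r\n\x1a\n", "image/png"),
        (b"\xFF\xD8\xFF", "image/jpeg"),
        (b"PK\x03\x04", "application/zip"),
    ];
    SIGNATURES
        .iter()
        .find(|(magic, _)| bytes.starts_with(magic))
        .map(|(_, mime)| *mime)
}

/// Extension first; content inspection only when the extension gives no answer.
fn detect(path: &str, contents: &[u8]) -> Option<&'static str> {
    let by_ext = Path::new(path)
        .extension()
        .and_then(|e| e.to_str())
        .and_then(|e| match e.to_ascii_lowercase().as_str() {
            "pdf" => Some("application/pdf"),
            "png" => Some("image/png"),
            _ => None,
        });
    by_ext.or_else(|| sniff_mime(contents))
}
```

One reason to prefer the extension when it is available: `.docx` and `.xlsx` files are ZIP containers, so a byte-level check like this one reports them as `application/zip` unless it also inspects the archive contents.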

## Extension Mapping

118+ extensions mapped in `EXT_TO_MIME` (`core/mime.rs`). Case-insensitive.

Key mappings: `.pdf` → `application/pdf`, `.docx` → `application/vnd.openxmlformats-officedocument.wordprocessingml.document`, `.xlsx` → `application/vnd.openxmlformats-officedocument.spreadsheetml.sheet`, `.png` → `image/png`, `.jpg` → `image/jpeg`
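
A case-insensitive extension table can be sketched as a lazily initialized map with lowercase keys, with the extension lowercased before lookup. This assumes nothing about how `EXT_TO_MIME` is actually declared; `ext_table` and `mime_for_path` are illustrative names and the table holds only a handful of entries.

```rust
use std::collections::HashMap;
use std::path::Path;
use std::sync::OnceLock;

/// Illustrative subset of an extension table; keys are stored lowercase.
fn ext_table() -> &'static HashMap<&'static str, &'static str> {
    static TABLE: OnceLock<HashMap<&'static str, &'static str>> = OnceLock::new();
    TABLE.get_or_init(|| {
        let mut m = HashMap::new();
        m.insert("pdf", "application/pdf");
        m.insert(
            "docx",
            "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
        );
        m.insert(
            "xlsx",
            "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
        );
        m.insert("png", "image/png");
        m.insert("jpg", "image/jpeg");
        m
    })
}

/// Looks up the MIME type for a path, ignoring the case of the extension.
fn mime_for_path(path: &str) -> Option<&'static str> {
    let ext = Path::new(path).extension()?.to_str()?.to_ascii_lowercase();
    ext_table().get(ext.as_str()).copied()
}
```

Normalizing the key once at lookup time keeps the table small: `Report.PDF` and `photo.JpG` resolve through the same entries as their lowercase forms.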

## Registry Selection

```rust
// In core/extractor/bytes.rs
fn select_extractor_for_mime(mime_type: &str) -> Result<Arc<dyn DocumentExtractor>> {
    // Take a read guard on the shared extractor registry.
    let registry = get_document_extractor_registry();
    let registry_guard = registry.read()?;
    // A MIME type with no registered extractor becomes an UnsupportedFormat error.
    registry_guard.get_for_mime_type(mime_type)
        .ok_or_else(|| KreuzbergError::UnsupportedFormat(mime_type.into()))
}
```

The lookup selects the highest-priority extractor registered for that MIME type. If none is registered, the function returns `KreuzbergError::UnsupportedFormat`.
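
How priority-based selection might work can be shown with a small self-contained registry. Everything here is an assumption made for illustration: the `Extractor` trait, its `priority()` method, the `Named` and `Registry` types, and the `String` error type are hypothetical and do not mirror the real `DocumentExtractor` API.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

/// Hypothetical minimal extractor interface.
trait Extractor: Send + Sync {
    fn name(&self) -> &str;
    fn priority(&self) -> i32;
}

/// Toy extractor: a name plus a priority.
struct Named(&'static str, i32);

impl Extractor for Named {
    fn name(&self) -> &str {
        self.0
    }
    fn priority(&self) -> i32 {
        self.1
    }
}

#[derive(Default)]
struct Registry {
    by_mime: HashMap<String, Vec<Arc<dyn Extractor>>>,
}

impl Registry {
    fn register(&mut self, mime: &str, extractor: Arc<dyn Extractor>) {
        self.by_mime.entry(mime.to_string()).or_default().push(extractor);
    }

    /// Highest-priority extractor registered for the MIME type, if any.
    fn get_for_mime_type(&self, mime: &str) -> Option<Arc<dyn Extractor>> {
        self.by_mime
            .get(mime)?
            .iter()
            .max_by_key(|e| e.priority())
            .cloned()
    }
}

/// Read-locks the registry and maps a missing entry to an error.
fn select(registry: &RwLock<Registry>, mime: &str) -> Result<Arc<dyn Extractor>, String> {
    // A poisoned lock is reported as an error instead of panicking.
    let guard = registry
        .read()
        .map_err(|_| "registry lock poisoned".to_string())?;
    guard
        .get_for_mime_type(mime)
        .ok_or_else(|| format!("unsupported format: {mime}"))
}
```

With two extractors registered for `application/pdf` at priorities 10 and 50, `select` returns the one with priority 50. Returning an `Arc` lets the caller keep using the extractor after the read guard is released.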

## Adding New MIME Types

1. Add extension mapping: `m.insert("ext", "application/x-new");` in `core/mime.rs`
2. Implement `DocumentExtractor` with `supported_mime_types()` returning the new MIME type
3. Register in `register_default_extractors()`
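
The three steps can be sketched together as follows. This is a simplified model under stated assumptions: the `DocumentExtractor` trait here is reduced to a single method, `NewFormatExtractor` is a made-up type, and both `ext_to_mime` and the `HashMap`-based registry passed to `register_default_extractors` are hypothetical stand-ins for the real structures.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Hypothetical, trimmed-down version of the extractor trait.
trait DocumentExtractor: Send + Sync {
    fn supported_mime_types(&self) -> &[&str];
}

/// Step 2: an extractor that declares the new MIME type.
struct NewFormatExtractor;

impl DocumentExtractor for NewFormatExtractor {
    fn supported_mime_types(&self) -> &[&str] {
        &["application/x-new"]
    }
}

/// Step 1: the extension table gains one entry for the new format.
fn ext_to_mime() -> HashMap<&'static str, &'static str> {
    let mut m = HashMap::new();
    m.insert("pdf", "application/pdf");
    m.insert("ext", "application/x-new"); // the new mapping
    m
}

/// Step 3: register the extractor under every MIME type it declares.
fn register_default_extractors(registry: &mut HashMap<String, Arc<dyn DocumentExtractor>>) {
    let extractor: Arc<dyn DocumentExtractor> = Arc::new(NewFormatExtractor);
    for mime in extractor.supported_mime_types() {
        registry.insert(mime.to_string(), Arc::clone(&extractor));
    }
}
```

The point of the sketch is that the mapping and the registration must agree on the MIME string: an extension that maps to a type no extractor declares would be detected but then rejected as unsupported.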

## Wildcard Support

Extractors can register for MIME type families: `"image/*"` matches `image/png`, `image/jpeg`, etc.
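
A family pattern such as `"image/*"` can be matched by comparing only the top-level type. The helper below is a minimal sketch with a hypothetical name, `mime_matches`; it uses exact string comparison and does not model how the real registry ranks an exact registration against a wildcard one.

```rust
/// Returns true when `pattern` (an exact type or a `type/*` family) covers `mime`.
fn mime_matches(pattern: &str, mime: &str) -> bool {
    match pattern.strip_suffix("/*") {
        // Family pattern: compare only the part before the slash.
        Some(family) => mime
            .split_once('/')
            .map_or(false, |(top_level, _)| top_level == family),
        // Exact pattern: the whole string must match.
        None => pattern == mime,
    }
}
```

Splitting on the slash rather than using a plain prefix test avoids false positives: `"image/*"` covers `image/png` but not a type such as `imagery/png`.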

## Critical Rules

1. Always call `validate_mime_type()` before extraction
2. Extension mapping is case-insensitive
3. Content inspection (infer crate) is the fallback for extension-less files
4. Registry validation is the final authority on supported types