transloadit-media-processing

Process media files (video, audio, images, documents) using Transloadit. Use when asked to encode video to HLS/MP4, generate thumbnails, resize or watermark images, extract audio, concatenate clips, add subtitles, OCR documents, or run any media processing pipeline. Covers 86+ processing robots for file transformation at scale.

23 stars

Best use case

transloadit-media-processing is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Process media files (video, audio, images, documents) using Transloadit. Use when asked to encode video to HLS/MP4, generate thumbnails, resize or watermark images, extract audio, concatenate clips, add subtitles, OCR documents, or run any media processing pipeline. Covers 86+ processing robots for file transformation at scale.

Teams using transloadit-media-processing should expect a more consistent output, faster repeated execution, less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$curl -o ~/.claude/skills/transloadit-media-processing/SKILL.md --create-dirs "https://raw.githubusercontent.com/christophacham/agent-skills-library/main/skills/database/transloadit-media-processing/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/transloadit-media-processing/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How transloadit-media-processing Compares

Feature / Agenttransloadit-media-processingStandard Approach
Platform SupportNot specifiedLimited / Varies
Context Awareness High Baseline
Installation ComplexityUnknownN/A

Frequently Asked Questions

What does this skill do?

Process media files (video, audio, images, documents) using Transloadit. Use when asked to encode video to HLS/MP4, generate thumbnails, resize or watermark images, extract audio, concatenate clips, add subtitles, OCR documents, or run any media processing pipeline. Covers 86+ processing robots for file transformation at scale.

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# Transloadit Media Processing

Process, transform, and encode media files using Transloadit's cloud infrastructure.
Supports video, audio, images, and documents with 86+ specialized processing robots.

## When to Use This Skill

Use this skill when you need to:

- Encode video to HLS, MP4, WebM, or other formats
- Generate thumbnails or animated GIFs from video
- Resize, crop, watermark, or optimize images
- Convert between image formats (JPEG, PNG, WebP, AVIF, HEIF)
- Extract or transcode audio (MP3, AAC, FLAC, WAV)
- Concatenate video or audio clips
- Add subtitles or overlay text on video
- OCR documents (PDF, scanned images)
- Run speech-to-text or text-to-speech
- Apply AI-based content moderation or object detection
- Build multi-step media pipelines that chain operations together

## Setup

### Option A: MCP Server (recommended for Copilot)

Add the Transloadit MCP server to your IDE config. This gives the agent direct access
to Transloadit tools (`create_template`, `create_assembly`, `list_assembly_notifications`, etc.).

**VS Code / GitHub Copilot** (`.vscode/mcp.json` or user settings):

```json
{
  "servers": {
    "transloadit": {
      "command": "npx",
      "args": ["-y", "@transloadit/mcp-server", "stdio"],
      "env": {
        "TRANSLOADIT_KEY": "YOUR_AUTH_KEY",
        "TRANSLOADIT_SECRET": "YOUR_AUTH_SECRET"
      }
    }
  }
}
```

Get your API credentials at https://transloadit.com/c/-/api-credentials

### Option B: CLI

If you prefer running commands directly:

```bash
npx -y @transloadit/node assemblies create \
  --steps '{"encoded": {"robot": "/video/encode", "use": ":original", "preset": "hls-1080p"}}' \
  --wait \
  --input ./my-video.mp4
```

## Core Workflows

### Encode Video to HLS (Adaptive Streaming)

```json
{
  "steps": {
    "encoded": {
      "robot": "/video/encode",
      "use": ":original",
      "preset": "hls-1080p"
    }
  }
}
```

### Generate Thumbnails from Video

```json
{
  "steps": {
    "thumbnails": {
      "robot": "/video/thumbs",
      "use": ":original",
      "count": 8,
      "width": 320,
      "height": 240
    }
  }
}
```

### Resize and Watermark Images

```json
{
  "steps": {
    "resized": {
      "robot": "/image/resize",
      "use": ":original",
      "width": 1200,
      "height": 800,
      "resize_strategy": "fit"
    },
    "watermarked": {
      "robot": "/image/resize",
      "use": "resized",
      "watermark_url": "https://example.com/logo.png",
      "watermark_position": "bottom-right",
      "watermark_size": "15%"
    }
  }
}
```

### OCR a Document

```json
{
  "steps": {
    "recognized": {
      "robot": "/document/ocr",
      "use": ":original",
      "provider": "aws",
      "format": "text"
    }
  }
}
```

### Concatenate Audio Clips

```json
{
  "steps": {
    "imported": {
      "robot": "/http/import",
      "url": ["https://example.com/clip1.mp3", "https://example.com/clip2.mp3"]
    },
    "concatenated": {
      "robot": "/audio/concat",
      "use": "imported",
      "preset": "mp3"
    }
  }
}
```

## Multi-Step Pipelines

Steps can be chained using the `"use"` field. Each step references a previous step's output:

```json
{
  "steps": {
    "resized": {
      "robot": "/image/resize",
      "use": ":original",
      "width": 1920
    },
    "optimized": {
      "robot": "/image/optimize",
      "use": "resized"
    },
    "exported": {
      "robot": "/s3/store",
      "use": "optimized",
      "bucket": "my-bucket",
      "path": "processed/${file.name}"
    }
  }
}
```

## Key Concepts

- **Assembly**: A single processing job. Created via `create_assembly` (MCP) or `assemblies create` (CLI).
- **Template**: A reusable set of steps stored on Transloadit. Created via `create_template` (MCP) or `templates create` (CLI).
- **Robot**: A processing unit (e.g., `/video/encode`, `/image/resize`). See full list at https://transloadit.com/docs/transcoding/
- **Steps**: JSON object defining the pipeline. Each key is a step name, each value configures a robot.
- **`:original`**: Refers to the uploaded input file.

## Tips

- Use `--wait` with the CLI to block until processing completes.
- Use `preset` values (e.g., `"hls-1080p"`, `"mp3"`, `"webp"`) for common format targets instead of specifying every parameter.
- Chain `"use": "step_name"` to build multi-step pipelines without intermediate downloads.
- For batch processing, use `/http/import` to pull files from URLs, S3, GCS, Azure, FTP, or Dropbox.
- Templates can include `${variables}` for dynamic values passed at assembly creation time.

Related Skills

uniprot-database

23
from christophacham/agent-skills-library

Direct REST API access to UniProt. Protein searches, FASTA retrieval, ID mapping, Swiss-Prot/TrEMBL. For Python workflows with multiple databases, prefer bioservices (unified interface to 40+ services). Use this for direct HTTP/REST work or UniProt-specific control.

tsdown

23
from christophacham/agent-skills-library

Bundle TypeScript and JavaScript libraries with blazing-fast speed powered by Rolldown. Use when building libraries, generating type declarations, bundling for multiple formats, or migrating from tsup.

treatment-plans

23
from christophacham/agent-skills-library

Generate concise (3-4 page), focused medical treatment plans in LaTeX/PDF format for all clinical specialties. Supports general medical treatment, rehabilitation therapy, mental health care, chronic disease management, perioperative care, and pain management. Includes SMART goal frameworks, evidence-based interventions with minimal text citations, regulatory compliance (HIPAA), and professional formatting. Prioritizes brevity and clinical actionability.

transformers

23
from christophacham/agent-skills-library

This skill should be used when working with pre-trained transformer models for natural language processing, computer vision, audio, or multimodal tasks. Use for text generation, classification, question answering, translation, summarization, image classification, object detection, speech recognition, and fine-tuning models on custom datasets.

tapform-automation

23
from christophacham/agent-skills-library

Automate Tapform tasks via Rube MCP (Composio). Always search tools first for current schemas.

supabase-automation

23
from christophacham/agent-skills-library

Automate Supabase database queries, table management, project administration, storage, edge functions, and SQL execution via Rube MCP (Composio). Always search tools first for current schemas.

string-database

23
from christophacham/agent-skills-library

Query STRING API for protein-protein interactions (59M proteins, 20B interactions). Network analysis, GO/KEGG enrichment, interaction discovery, 5000+ species, for systems biology.

stormglass-io-automation

23
from christophacham/agent-skills-library

Automate Stormglass IO tasks via Rube MCP (Composio). Always search tools first for current schemas.

statistical-analysis

23
from christophacham/agent-skills-library

Guided statistical analysis with test selection and reporting. Use when you need help choosing appropriate tests for your data, assumption checking, power analysis, and APA-formatted results. Best for academic research reporting, test selection guidance. For implementing specific models programmatically use statsmodels.

sqlmap-database-pentesting

23
from christophacham/agent-skills-library

This skill should be used when the user asks to "automate SQL injection testing," "enumerate database structure," "extract database credentials using sqlmap," "dump tables and columns...

sql-pro

23
from christophacham/agent-skills-library

Master modern SQL with cloud-native databases, OLTP/OLAP optimization, and advanced query techniques. Expert in performance tuning, data modeling, and hybrid analytical systems.

sql-optimization

23
from christophacham/agent-skills-library

Universal SQL performance optimization assistant for comprehensive query tuning, indexing strategies, and database performance analysis across all SQL databases (MySQL, PostgreSQL, SQL Server, Oracle). Provides execution plan analysis, pagination optimization, batch operations, and performance monitoring guidance.