transloadit-media-processing
Process media files (video, audio, images, documents) using Transloadit. Use when asked to encode video to HLS/MP4, generate thumbnails, resize or watermark images, extract audio, concatenate clips, add subtitles, OCR documents, or run any media processing pipeline. Covers 86+ processing robots for file transformation at scale.
Best use case
transloadit-media-processing is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Process media files (video, audio, images, documents) using Transloadit. Use when asked to encode video to HLS/MP4, generate thumbnails, resize or watermark images, extract audio, concatenate clips, add subtitles, OCR documents, or run any media processing pipeline. Covers 86+ processing robots for file transformation at scale.
Teams using transloadit-media-processing should expect a more consistent output, faster repeated execution, less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in
.claude/skills/transloadit-media-processing/SKILL.mdinside your project - Restart your AI agent — it will auto-discover the skill
How transloadit-media-processing Compares
| Feature / Agent | transloadit-media-processing | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Process media files (video, audio, images, documents) using Transloadit. Use when asked to encode video to HLS/MP4, generate thumbnails, resize or watermark images, extract audio, concatenate clips, add subtitles, OCR documents, or run any media processing pipeline. Covers 86+ processing robots for file transformation at scale.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
Related Guides
Top AI Agents for Productivity
See the top AI agent skills for productivity, workflow automation, operational systems, documentation, and everyday task execution.
AI Agents for Coding
Browse AI agent skills for coding, debugging, testing, refactoring, code review, and developer workflows across Claude, Cursor, and Codex.
Best AI Skills for Claude
Explore the best AI skills for Claude and Claude Code across coding, research, workflow automation, documentation, and agent operations.
SKILL.md Source
# Transloadit Media Processing
Process, transform, and encode media files using Transloadit's cloud infrastructure.
Supports video, audio, images, and documents with 86+ specialized processing robots.
## When to Use This Skill
Use this skill when you need to:
- Encode video to HLS, MP4, WebM, or other formats
- Generate thumbnails or animated GIFs from video
- Resize, crop, watermark, or optimize images
- Convert between image formats (JPEG, PNG, WebP, AVIF, HEIF)
- Extract or transcode audio (MP3, AAC, FLAC, WAV)
- Concatenate video or audio clips
- Add subtitles or overlay text on video
- OCR documents (PDF, scanned images)
- Run speech-to-text or text-to-speech
- Apply AI-based content moderation or object detection
- Build multi-step media pipelines that chain operations together
## Setup
### Option A: MCP Server (recommended for Copilot)
Add the Transloadit MCP server to your IDE config. This gives the agent direct access
to Transloadit tools (`create_template`, `create_assembly`, `list_assembly_notifications`, etc.).
**VS Code / GitHub Copilot** (`.vscode/mcp.json` or user settings):
```json
{
"servers": {
"transloadit": {
"command": "npx",
"args": ["-y", "@transloadit/mcp-server", "stdio"],
"env": {
"TRANSLOADIT_KEY": "YOUR_AUTH_KEY",
"TRANSLOADIT_SECRET": "YOUR_AUTH_SECRET"
}
}
}
}
```
Get your API credentials at https://transloadit.com/c/-/api-credentials
### Option B: CLI
If you prefer running commands directly:
```bash
npx -y @transloadit/node assemblies create \
--steps '{"encoded": {"robot": "/video/encode", "use": ":original", "preset": "hls-1080p"}}' \
--wait \
--input ./my-video.mp4
```
## Core Workflows
### Encode Video to HLS (Adaptive Streaming)
```json
{
"steps": {
"encoded": {
"robot": "/video/encode",
"use": ":original",
"preset": "hls-1080p"
}
}
}
```
### Generate Thumbnails from Video
```json
{
"steps": {
"thumbnails": {
"robot": "/video/thumbs",
"use": ":original",
"count": 8,
"width": 320,
"height": 240
}
}
}
```
### Resize and Watermark Images
```json
{
"steps": {
"resized": {
"robot": "/image/resize",
"use": ":original",
"width": 1200,
"height": 800,
"resize_strategy": "fit"
},
"watermarked": {
"robot": "/image/resize",
"use": "resized",
"watermark_url": "https://example.com/logo.png",
"watermark_position": "bottom-right",
"watermark_size": "15%"
}
}
}
```
### OCR a Document
```json
{
"steps": {
"recognized": {
"robot": "/document/ocr",
"use": ":original",
"provider": "aws",
"format": "text"
}
}
}
```
### Concatenate Audio Clips
```json
{
"steps": {
"imported": {
"robot": "/http/import",
"url": ["https://example.com/clip1.mp3", "https://example.com/clip2.mp3"]
},
"concatenated": {
"robot": "/audio/concat",
"use": "imported",
"preset": "mp3"
}
}
}
```
## Multi-Step Pipelines
Steps can be chained using the `"use"` field. Each step references a previous step's output:
```json
{
"steps": {
"resized": {
"robot": "/image/resize",
"use": ":original",
"width": 1920
},
"optimized": {
"robot": "/image/optimize",
"use": "resized"
},
"exported": {
"robot": "/s3/store",
"use": "optimized",
"bucket": "my-bucket",
"path": "processed/${file.name}"
}
}
}
```
## Key Concepts
- **Assembly**: A single processing job. Created via `create_assembly` (MCP) or `assemblies create` (CLI).
- **Template**: A reusable set of steps stored on Transloadit. Created via `create_template` (MCP) or `templates create` (CLI).
- **Robot**: A processing unit (e.g., `/video/encode`, `/image/resize`). See full list at https://transloadit.com/docs/transcoding/
- **Steps**: JSON object defining the pipeline. Each key is a step name, each value configures a robot.
- **`:original`**: Refers to the uploaded input file.
## Tips
- Use `--wait` with the CLI to block until processing completes.
- Use `preset` values (e.g., `"hls-1080p"`, `"mp3"`, `"webp"`) for common format targets instead of specifying every parameter.
- Chain `"use": "step_name"` to build multi-step pipelines without intermediate downloads.
- For batch processing, use `/http/import` to pull files from URLs, S3, GCS, Azure, FTP, or Dropbox.
- Templates can include `${variables}` for dynamic values passed at assembly creation time.Related Skills
python-pypi-package-builder
End-to-end skill for building, testing, linting, versioning, and publishing a production-grade Python library to PyPI. Covers all four build backends (setuptools+setuptools_scm, hatchling, flit, poetry), PEP 440 versioning, semantic versioning, dynamic git-tag versioning, OOP/SOLID design, type hints (PEP 484/526/544/561), Trusted Publishing (OIDC), and the full PyPA packaging flow. Use for: creating Python packages, pip-installable SDKs, CLI tools, framework plugins, pyproject.toml setup, py.typed, setuptools_scm, semver, mypy, pre-commit, GitHub Actions CI/CD, or PyPI publishing.
mcp-security-audit
Audit MCP (Model Context Protocol) server configurations for security issues. Use this skill when: - Reviewing .mcp.json files for security risks - Checking MCP server args for hardcoded secrets or shell injection patterns - Validating that MCP servers use pinned versions (not @latest) - Detecting unpinned dependencies in MCP server configurations - Auditing which MCP servers a project registers and whether they're on an approved list - Checking for environment variable usage vs. hardcoded credentials in MCP configs - Any request like "is my MCP config secure?", "audit my MCP servers", or "check .mcp.json" keywords: [mcp, security, audit, secrets, shell-injection, supply-chain, governance]
lsp-setup
Enable code intelligence (go-to-definition, find-references, hover, type info) for any programming language by installing and configuring an LSP server for Copilot CLI. Detects the OS, installs the right server, and generates the JSON configuration (user-level or repo-level). Use when you need deeper code understanding and no LSP server is configured, or when the user asks to set up, install, or configure an LSP server.
gsap-framer-scroll-animation
Use this skill whenever the user wants to build scroll animations, scroll effects, parallax, scroll-triggered reveals, pinned sections, horizontal scroll, text animations, or any motion tied to scroll position — in vanilla JS, React, or Next.js. Covers GSAP ScrollTrigger (pinning, scrubbing, snapping, timelines, horizontal scroll, ScrollSmoother, matchMedia) and Framer Motion / Motion v12 (useScroll, useTransform, useSpring, whileInView, variants). Use this skill even if the user just says "animate on scroll", "fade in as I scroll", "make it scroll like Apple", "parallax effect", "sticky section", "scroll progress bar", or "entrance animation". Also triggers for Copilot prompt patterns for GSAP or Framer Motion code generation. Pairs with the premium-frontend-ui skill for creative philosophy and design-level polish.
freecad-scripts
Expert skill for writing FreeCAD Python scripts, macros, and automation. Use when asked to create FreeCAD models, parametric objects, Part/Mesh/Sketcher scripts, workbench tools, GUI dialogs with PySide, Coin3D scenegraph manipulation, or any FreeCAD Python API task. Covers FreeCAD scripting basics, geometry creation, FeaturePython objects, interface tools, and macro development.
write-coding-standards-from-file
Write a coding standards document for a project using the coding styles from the file(s) and/or folder(s) passed as arguments in the prompt.
workiq-copilot
Guides the Copilot CLI on how to use the WorkIQ CLI/MCP server to query Microsoft 365 Copilot data (emails, meetings, docs, Teams, people) for live context, summaries, and recommendations.
winmd-api-search
Find and explore Windows desktop APIs. Use when building features that need platform capabilities — camera, file access, notifications, UI controls, AI/ML, sensors, networking, etc. Discovers the right API for a task and retrieves full type details (methods, properties, events, enumeration values).
winapp-cli
Windows App Development CLI (winapp) for building, packaging, and deploying Windows applications. Use when asked to initialize Windows app projects, create MSIX packages, generate AppxManifest.xml, manage development certificates, add package identity for debugging, sign packages, publish to the Microsoft Store, create external catalogs, or access Windows SDK build tools. Supports .NET (csproj), C++, Electron, Rust, Tauri, and cross-platform frameworks targeting Windows.
webapp-testing
Toolkit for interacting with and testing local web applications using Playwright. Supports verifying frontend functionality, debugging UI behavior, capturing browser screenshots, and viewing browser logs.
web-design-reviewer
This skill enables visual inspection of websites running locally or remotely to identify and fix design issues. Triggers on requests like "review website design", "check the UI", "fix the layout", "find design problems". Detects issues with responsive design, accessibility, visual consistency, and layout breakage, then performs fixes at the source code level.
web-coder
Expert 10x engineer with comprehensive knowledge of web development, internet protocols, and web standards. Use when working with HTML, CSS, JavaScript, web APIs, HTTP/HTTPS, web security, performance optimization, accessibility, or any web/internet concepts. Specializes in translating web terminology accurately and implementing modern web standards across frontend and backend development.