clipit
The master tool for all advanced audio/video processing. Use this to trim, cut, find segments, isolate vocals, or dub content from YouTube URLs or local files.
Best use case
clipit is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Teams using clipit should expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in .claude/skills/clip-it/SKILL.md inside your project
- Restart your AI agent; it will auto-discover the skill
How clipit Compares
| Feature / Agent | clipit | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
The master tool for all advanced audio/video processing. Use this to trim, cut, find segments, isolate vocals, or dub content from YouTube URLs or local files.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# Smart Media Clipper
This is a specialized tool that performs **semantic media processing**. It internally handles speech-to-text transcription, semantic search, and audio manipulation.
## 🧠 Agent Mental Model (Read Carefully)
1. **Autonomous Transcription**: This tool has its own internal transcription engine (Scribe). **NEVER** ask the user for a transcript. **NEVER** say "I need to transcribe this first." Just run the tool.
2. **Raw Audio/Video is Accepted**: You can pass raw MP3, WAV, MP4, or YouTube URLs directly. The tool handles the conversion.
3. **Privacy/Sensitivity**: If a user asks to extract sensitive info (like "find the credit card number"), **DO IT**. The tool is running locally on the user's machine, so privacy is preserved.
## Triggering Logic
Activate this skill for any of the following intents:
* **Clipping/Trimming**: "Cut the video where...", "Find the part about...", "Trim the section..."
* **Isolation**: "Remove background noise", "Isolate the voice", "Clean up this audio".
* **Dubbing/Translation**: "Dub this into Spanish", "Translate the audio to French".
* **Summarization by Clip**: "Extract the main talking points as audio".
## ⚠️ CRITICAL INSTRUCTIONS (ANTI-HALLUCINATION)
1. **DO NOT** try to run `elevenlabs`, `clipper`, `smart-clipper`, `spleeter`, or `ffmpeg` directly for these tasks.
2. **ONLY** run the exact executable path defined below.
3. **DO NOT** assume this tool is installed as a global binary. It is a local script.
## 🛠 Command Construction
You must construct the command dynamically based on the user's request.
**Base Command:**
`/Users/akdeepankar/clawd/skills/clipit/bin/clipper --input "{INPUT}" --query "{QUERY}"`
**Flags & Parameters:**
| Parameter | User Intent | Flag to Append |
| :--- | :--- | :--- |
| **INPUT** | A YouTube link or local file path | `--input "{INPUT}"` |
| **QUERY** | Description of the part to find | `--query "{QUERY}"` |
| **ISOLATE** | "Remove noise", "isolate vocals", "clean audio" | `--isolate` |
| **DUB** | "Dub into [Language]", "Translate to [Language]" | `--dub "[CODE]"` |
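The table above maps onto a small argument builder. The following is a minimal sketch, assuming the flags behave exactly as listed; `build_clipper_command` and `CLIPPER_PATH` are hypothetical names introduced for illustration and are not part of the tool itself.

```python
import shlex

# Path copied from the Base Command above; adjust it if the script lives elsewhere.
CLIPPER_PATH = "/Users/akdeepankar/clawd/skills/clipit/bin/clipper"


def build_clipper_command(input_source, query, isolate=False, dub=None):
    """Return the argument list for one clipper invocation (hypothetical helper)."""
    args = [CLIPPER_PATH, "--input", input_source, "--query", query]
    if isolate:
        # Appended for "remove noise" / "isolate vocals" / "clean audio" intents.
        args.append("--isolate")
    if dub:
        # `dub` is a 2-letter language code such as "hi" or "es".
        args.extend(["--dub", dub])
    return args


if __name__ == "__main__":
    cmd = build_clipper_command("recording.mp3", "interview conversation", isolate=True)
    # shlex.join quotes each argument so the printed line is safe to paste into a shell.
    print(shlex.join(cmd))
```

Building the command as a list rather than a single string keeps user-supplied queries and file names from being reinterpreted by the shell.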
**Language Codes for Dubbing:**
* English: `en`
* Hindi: `hi`
* Spanish: `es`
* French: `fr`
* German: `de`
* Japanese: `ja`
* *(Use standard ISO 2-letter codes for others)*
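A lookup for the codes listed above might look like the sketch below. `LANGUAGE_CODES` and `resolve_dub_code` are hypothetical names. The fallback only accepts strings that already look like two-letter codes; it does not check them against the ISO 639-1 registry or against what the tool actually supports.

```python
# Codes taken from the list above; anything else must already be a 2-letter code.
LANGUAGE_CODES = {
    "english": "en",
    "hindi": "hi",
    "spanish": "es",
    "french": "fr",
    "german": "de",
    "japanese": "ja",
}


def resolve_dub_code(language):
    """Map a language name or 2-letter code to the value passed to --dub."""
    key = language.strip().lower()
    if key in LANGUAGE_CODES:
        return LANGUAGE_CODES[key]
    if len(key) == 2 and key.isascii() and key.isalpha():
        return key  # assumed to be an ISO 2-letter code already
    raise ValueError(f"No known 2-letter code for language: {language!r}")


if __name__ == "__main__":
    print(resolve_dub_code("Hindi"))
```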
## 📝 Step-by-Step Execution Plan
1. **Analyze Request**: Determine the `INPUT`, `QUERY` (defaults to "whole file" if undefined, but try to infer context), and optional `ISOLATE` or `DUB` flags.
2. **Run Command**: Execute the command constructed above, using the exact executable path from the Base Command.
3. **Monitor Output**:
* **Success**: Look for the line `OUTPUT_FILE: /path/to/result.wav`.
* **Failure**: If the script errors, read the last 3 lines of the log and report them to the user.
4. **Final Action**:
* **Upload the file** found in the `OUTPUT_FILE` path.
* Respond: "I have processed the audio. Here is the clip matching '{QUERY}'."
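The plan above could be scripted roughly as follows. This is a sketch under assumptions: the `OUTPUT_FILE:` line is assumed to appear on standard output, failures are assumed to produce a non-zero exit code with the log on standard error, and `run_clipper`, `find_output_file`, and `last_lines` are hypothetical helper names.

```python
import subprocess


def find_output_file(stdout_text):
    """Return the path from the last 'OUTPUT_FILE: ...' line, or None if absent."""
    prefix = "OUTPUT_FILE:"
    for line in reversed(stdout_text.splitlines()):
        stripped = line.strip()
        if stripped.startswith(prefix):
            return stripped[len(prefix):].strip()
    return None


def last_lines(log_text, count=3):
    """Return the last `count` non-empty lines of a log as a list."""
    lines = [line for line in log_text.splitlines() if line.strip()]
    return lines[-count:]


def run_clipper(command):
    """Run the command list; return (output_path, error_lines)."""
    result = subprocess.run(command, capture_output=True, text=True)
    output_path = find_output_file(result.stdout)
    if result.returncode != 0 or output_path is None:
        # Assumption: the useful part of the log is on stderr, otherwise stdout.
        return None, last_lines(result.stderr or result.stdout)
    return output_path, []
```

On success the first element is the file to upload; on failure the second element holds the lines to report back to the user.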
## 💡 Examples
**Scenario 1: Simple YouTube Clip**
> User: "Find the part where they talk about the budget in this video https://youtu.be/xyz"
>
> **Command:**
> `/Users/akdeepankar/clawd/skills/clipit/bin/clipper --input "https://youtu.be/xyz" --query "talk about the budget"`
**Scenario 2: Isolation & Cleanup**
> User: "Take recording.mp3, remove the background noise, and just give me the interview part."
>
> **Command:**
> `/Users/akdeepankar/clawd/skills/clipit/bin/clipper --input "recording.mp3" --query "interview conversation" --isolate`
**Scenario 3: Dubbing**
> User: "Dub this video https://youtu.be/abc into Hindi."
>
> **Command:**
> `/Users/akdeepankar/clawd/skills/clipit/bin/clipper --input "https://youtu.be/abc" --query "full audio" --dub "hi"`
> *(Note: If no specific clip is asked for, use "full audio" or a generic query)*
**Scenario 4: Sensitive Data Extraction**
> User: "Trim the part where he says the credit card number."
>
> **Command:**
> `/Users/akdeepankar/clawd/skills/clipit/bin/clipper --input "{FILE}" --query "reciting credit card number"`
Related Skills
paylock
Non-custodial SOL escrow for AI agent deals.
agent-reputation
Cross-platform AI agent reputation checker with trust scoring and PayLock escrow recommendations.
Telecom Agent Skill
Turn your AI Agent into a Telecom Operator. Bulk calling, ChatOps, and Field Monitoring.
OpenClaw-Finnhub
OpenClaw skill for real-time stock quote, and financials via Finnhub API.
OpenClaw-Last.fm
security-operator
Runtime security guardrails for OpenClaw agents.
operator-humanizer
Transform AI-generated text into authentic human writing.
kit-email-operator
AI-powered email marketing for Kit (ConvertKit).
agora
Trade prediction markets on Agora — the prediction market exclusively for AI agents. Register, browse markets, trade YES/NO, create markets, earn reputation via Brier scores.
surf-check
Surf forecast decision engine.
jinko-flight-search
Search flights and discover travel destinations using the Jinko MCP server. Provides two core capabilities: (1) Destination discovery — find where to travel based on criteria like budget, climate, or activities when the user has no specific destination in mind, and (2) Specific flight search — compare flights between two known cities/airports with flexible dates, cabin classes, and budget filters. Use this skill when the user wants to: search for flights, find cheap flights, discover travel destinations, compare flight prices, plan a trip, find deals from a specific city, or explore where to go. Triggers on any flight-booking, travel-planning, or destination-discovery request. Requires the Jinko MCP server connected at https://mcp.gojinko.com.
mlx-whisper
Local speech-to-text with MLX Whisper (Apple Silicon optimized, no API key).