AudioEditor

AI-powered audio/video editing — transcription, intelligent cut detection, automated editing with crossfades, and optional cloud polish. USE WHEN clean audio, edit audio, remove filler words, clean podcast, remove ums, fix audio, cut dead air, polish audio, clean recording, transcribe and edit.

16 stars

by diegosouzapw

Best use case

AudioEditor is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using AudioEditor should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/audioeditor/SKILL.md --create-dirs "https://raw.githubusercontent.com/diegosouzapw/awesome-omni-skill/main/skills/content-media/audioeditor/SKILL.md"

Manual Installation

1. Download SKILL.md from GitHub.
2. Place it in .claude/skills/audioeditor/SKILL.md inside your project.
3. Restart your AI agent; it will auto-discover the skill.

How AudioEditor Compares

| Feature / Agent | AudioEditor | Standard Approach |
|-----------------|-------------|-------------------|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |

Frequently Asked Questions

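What does this skill do?

AudioEditor provides AI-powered audio/video editing: transcription, intelligent cut detection, automated editing with crossfades, and optional cloud polish.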

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# AudioEditor

AI-powered audio/video editing — transcription, intelligent cut detection, automated editing with crossfades, and optional cloud polish.

## Customization

**Before executing, check for user customizations at:**
`~/.claude/PAI/USER/SKILLCUSTOMIZATIONS/AudioEditor/`

If this directory exists, load and apply any PREFERENCES.md, configurations, or resources found there. These override default behavior. If the directory does not exist, proceed with skill defaults.
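
The lookup described above could be sketched roughly as follows. This is only an illustration, not the skill's actual loader: `loadPreferences` is a hypothetical helper, and it handles only `PREFERENCES.md`. Any other configurations or resources in the directory would need their own handling.

```typescript
import { existsSync, readFileSync } from "node:fs";
import { homedir } from "node:os";
import { join } from "node:path";

// Default location named in the Customization section ("~" expanded by hand).
const DEFAULT_DIR = join(
  homedir(), ".claude", "PAI", "USER", "SKILLCUSTOMIZATIONS", "AudioEditor",
);

// Hypothetical helper: returns the text of PREFERENCES.md, or null when the
// directory or the file is absent, in which case skill defaults apply.
function loadPreferences(dir: string = DEFAULT_DIR): string | null {
  const file = join(dir, "PREFERENCES.md");
  if (!existsSync(dir) || !existsSync(file)) return null;
  return readFileSync(file, "utf8");
}
```

A `null` result means "proceed with skill defaults"; a string result would be applied as overrides before the workflow starts.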

## Voice Notification

**You MUST send this notification BEFORE doing anything else when this skill is invoked.**

1. **Send voice notification**:
   ```bash
   curl -s -X POST http://localhost:8888/notify \
     -H "Content-Type: application/json" \
     -d '{"message": "Running the WORKFLOWNAME workflow in the AudioEditor skill to ACTION"}' \
     > /dev/null 2>&1 &
   ```

2. **Output text notification**:
   ```
   Running the **WorkflowName** workflow in the **AudioEditor** skill to ACTION...
   ```

**This is not optional. Execute this curl command immediately upon skill invocation.**
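
For callers that issue the notification from code rather than a shell, a rough TypeScript equivalent of the curl command might look like this. The function names `buildNotification` and `sendNotification` are hypothetical; the endpoint, header, and message wording are taken from the command above.

```typescript
// Endpoint taken from the curl command in this section.
const NOTIFY_URL = "http://localhost:8888/notify";

// Builds the JSON body; `workflow` and `action` stand in for the
// WORKFLOWNAME and ACTION placeholders.
function buildNotification(workflow: string, action: string): { message: string } {
  return {
    message: `Running the ${workflow} workflow in the AudioEditor skill to ${action}`,
  };
}

// Fire-and-forget POST: failures are swallowed, mirroring the
// `> /dev/null 2>&1 &` in the shell version.
function sendNotification(workflow: string, action: string): void {
  fetch(NOTIFY_URL, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildNotification(workflow, action)),
  }).catch(() => {
    // The notification server may not be running; ignore the failure.
  });
}
```

The call is not awaited, so the rest of the workflow is never blocked on the notification server.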

## Workflow Routing

| Workflow | Trigger | File |
|----------|---------|------|
| **Clean** | "clean audio", "edit audio", "remove filler words", "clean podcast", "remove ums", "cut dead air", "polish audio" | `Workflows/Clean.md` |
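
As a rough illustration of the table, routing can be thought of as a lookup from trigger phrases to a workflow file. The matcher below is a hypothetical simplification that uses case-insensitive substring matching. An agent would normally match on intent, so a request such as "clean up the audio" would still route to Clean even though this literal matcher would miss it.

```typescript
// Trigger phrases and file path copied from the routing table.
const ROUTES = [
  {
    workflow: "Clean",
    file: "Workflows/Clean.md",
    triggers: [
      "clean audio", "edit audio", "remove filler words", "clean podcast",
      "remove ums", "cut dead air", "polish audio",
    ],
  },
];

// Hypothetical matcher: returns the workflow file for the first route whose
// trigger phrase appears in the request, or null when nothing matches.
function routeWorkflow(request: string): string | null {
  const text = request.toLowerCase();
  for (const route of ROUTES) {
    if (route.triggers.some((trigger) => text.includes(trigger))) {
      return route.file;
    }
  }
  return null;
}
```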

## Pipeline Architecture

```
Audio Input
    |
[Transcribe] Whisper word-level timestamps (insanely-fast-whisper on MPS)
    |
[Analyze] Claude classifies each segment:
    |   KEEP / CUT_FILLER / CUT_FALSE_START / CUT_EDIT_MARKER / CUT_STUTTER / CUT_DEAD_AIR
    |   Distinguishes rhetorical emphasis from accidental repetition
    |
[Edit] ffmpeg executes cuts:
    |   - 40ms qsin crossfades at every edit point
    |   - Room tone extraction and gap filling
    |   - Breath attenuation (50% volume, not removal)
    |
[Polish] (optional) Cleanvoice API final pass:
        - Mouth sound removal
        - Remaining filler detection
        - Loudness normalization

Output: cleaned MP3/WAV
```
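
To make the Analyze and Edit stages more concrete, here is a minimal sketch under stated assumptions. The `Segment` shape and the helpers `keepRanges` and `breathFilter` are hypothetical, and segments are assumed to be sorted and non-overlapping. The filter strings use ffmpeg's `acrossfade` and `volume` filters as one plausible way to express "40ms qsin crossfades" and "50% volume"; the source does not show how `Edit.ts` actually builds its filter graph.

```typescript
// Classification labels listed in the pipeline diagram.
type Label =
  | "KEEP" | "CUT_FILLER" | "CUT_FALSE_START"
  | "CUT_EDIT_MARKER" | "CUT_STUTTER" | "CUT_DEAD_AIR";

// Assumed segment shape; times are in seconds.
interface Segment { start: number; end: number; label: Label; }

// Merges consecutive KEEP segments into ranges. Each boundary between two
// returned ranges is an edit point that would receive a crossfade.
function keepRanges(segments: Segment[]): Array<[number, number]> {
  const ranges: Array<[number, number]> = [];
  let prevKept = false;
  for (const seg of segments) {
    if (seg.label === "KEEP") {
      const last = ranges[ranges.length - 1];
      if (prevKept && last !== undefined) last[1] = seg.end;
      else ranges.push([seg.start, seg.end]);
      prevKept = true;
    } else {
      prevKept = false;
    }
  }
  return ranges;
}

// ffmpeg acrossfade with a 40 ms duration and qsin curves on both sides.
const CROSSFADE_FILTER = "acrossfade=d=0.04:c1=qsin:c2=qsin";

// ffmpeg volume filter that halves the level only between start and end.
function breathFilter(start: number, end: number): string {
  return `volume=0.5:enable='between(t,${start},${end})'`;
}
```

Working from keep ranges rather than cut ranges keeps the edit step simple: each range becomes one trimmed clip, and adjacent clips are joined with the crossfade filter.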

## Tools

| Tool | Command | Purpose |
|------|---------|---------|
| **Transcribe** | `bun ~/.claude/skills/Utilities/AudioEditor/Tools/Transcribe.ts <file>` | Word-level transcription via Whisper |
| **Analyze** | `bun ~/.claude/skills/Utilities/AudioEditor/Tools/Analyze.ts <transcript.json>` | LLM-powered edit classification |
| **Edit** | `bun ~/.claude/skills/Utilities/AudioEditor/Tools/Edit.ts <file> <edits.json>` | Execute cuts with crossfades + room tone |
| **Polish** | `bun ~/.claude/skills/Utilities/AudioEditor/Tools/Polish.ts <file>` | Cleanvoice API cloud polish |
| **Pipeline** | `bun ~/.claude/skills/Utilities/AudioEditor/Tools/Pipeline.ts <file> [--polish]` | Full end-to-end pipeline |
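
When launching the Pipeline tool from a script instead of a shell, note that `~` is not expanded automatically. The sketch below shows one way to build the argument list; `pipelineArgs` and `runPipeline` are hypothetical wrappers, and only the path and the `--polish` flag come from the table above.

```typescript
import { spawnSync } from "node:child_process";
import { homedir } from "node:os";
import { join } from "node:path";

// Tools directory from the table, with "~" expanded by hand.
const TOOLS_DIR = join(
  homedir(), ".claude", "skills", "Utilities", "AudioEditor", "Tools",
);

// Builds the argument list passed to `bun` for the full pipeline.
function pipelineArgs(file: string, polish = false): string[] {
  const args = [join(TOOLS_DIR, "Pipeline.ts"), file];
  if (polish) args.push("--polish");
  return args;
}

// Runs the pipeline and returns its exit status (null if it was killed
// by a signal or could not be started).
function runPipeline(file: string, polish = false): number | null {
  const result = spawnSync("bun", pipelineArgs(file, polish), { stdio: "inherit" });
  return result.status;
}
```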

## API Keys Required

| Service | Env Var | Where to Get |
|---------|---------|-------------|
| Anthropic (for analyze step) | `ANTHROPIC_API_KEY` | Already set via Claude Code |
| Cleanvoice (for polish step, optional) | `CLEANVOICE_API_KEY` | cleanvoice.ai -> Dashboard -> Settings -> API Key |
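
A small pre-flight check can catch a missing key before any audio is processed. This is a sketch only: `missingKeys` is a hypothetical helper, and it assumes the Cleanvoice key is needed only when the polish step is requested, as the table suggests.

```typescript
// Returns the names of required environment variables that are unset or empty.
function missingKeys(
  env: Record<string, string | undefined>,
  polish: boolean,
): string[] {
  const required = ["ANTHROPIC_API_KEY"];
  if (polish) required.push("CLEANVOICE_API_KEY");
  return required.filter((name) => !env[name]);
}
```

Calling `missingKeys(process.env, true)` before a polished run would return an empty array when both keys are present.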

## Examples

**Example 1: Clean a podcast recording**
```
User: "clean up the audio on this podcast file"
-> Invokes Clean workflow
-> Runs full pipeline: transcribe -> analyze -> edit
-> Outputs cleaned MP3 with filler words, stutters, and dead air removed
```

**Example 2: Preview edits before applying**
```
User: "show me what edits you'd make to this recording"
-> Invokes Clean workflow with --preview flag
-> Transcribes and analyzes, shows proposed edits without modifying audio
-> User reviews edit list, then runs again to apply
```
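
A preview of this kind could be rendered as one line per proposed edit. The sketch below is illustrative only: the `ProposedEdit` shape is an assumption, since the actual format of the edits file is not documented here, and `formatPreview` is a hypothetical helper.

```typescript
// Assumed shape of one proposed edit; times are in seconds.
interface ProposedEdit { start: number; end: number; label: string; text: string; }

// Formats each proposed edit as "start-end  LABEL  "text"" for review.
function formatPreview(edits: ProposedEdit[]): string[] {
  return edits.map(
    (e) => `${e.start.toFixed(2)}s-${e.end.toFixed(2)}s  ${e.label}  "${e.text}"`,
  );
}
```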

**Example 3: Aggressive clean with cloud polish**
```
User: "aggressively clean this audio and polish it"
-> Invokes Clean workflow with --aggressive --polish flags
-> Tighter thresholds for filler detection
-> Cleanvoice API pass for mouth sounds and normalization
```
