music-analysis
Analyze music/audio files locally without external APIs. Extract tempo, pocket/groove feel, pulse stability, swing proxy, section/repetition structure, key clarity, harmonic tension, timbre descriptors, temporal mood-energy journeys, and lyric-aware emotional reads where real Whisper lyrics can override the vibe when the words are clearly darker, warmer, or more intense than the arrangement alone suggests. Use when asked to 'listen to this', 'hear the music', audit tracks, compare mixes, inspect structure, or generate producer-facing notes from audio files.
Best use case
music-analysis is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Teams using music-analysis should expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in .claude/skills/music-analysis/SKILL.md inside your project
- Restart your AI agent; it will auto-discover the skill
How music-analysis Compares
| Feature / Agent | music-analysis | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
It analyzes music and audio files locally, without external APIs, and produces a report covering tempo and groove, section structure, key clarity and harmonic tension, timbre, the mood-energy journey over time, and, when Whisper is available, a lyric-aware emotional read.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# Music Analysis (Local, No External APIs)

Primary tool: a **full listen** that combines snapshot analysis, structure, groove, harmonic tension, temporal mood mapping, and optional Whisper lyric alignment into one report.

## 1. Full Listen (primary / recommended)

```bash
python3 skills/music-analysis/scripts/listen.py /path/to/audio.mp3
python3 skills/music-analysis/scripts/listen.py track.mp3 --json
python3 skills/music-analysis/scripts/listen.py track.mp3 --out report.txt
python3 skills/music-analysis/scripts/listen.py track.mp3 --json --out report.json
```

**What it does in one pass:**

1. Snapshot analysis: tempo, pulse stability, swing proxy, key clarity, harmonic tension, timbre, structure
2. Whisper lyric transcription and filtering first: keep only real lyric text, drop artifact tags like `[MUSIC]`
3. Temporal listen: windowed energy / mood / tension journey
4. Synthesis layer that aligns lyrics with peak / tension / quiet windows and lets the lyric layer override the final vibe when confidence is high

### Human-readable output structure

- **SNAPSHOT**
  - groove/pocket
  - structure summary + repeated sections
  - harmony (key clarity + tension)
  - timbre descriptor tags
- **INSTRUMENT READ**
  - likely instrument palette (strong/likely/possible confidence)
  - per-section instrument entrances and exits
  - how instruments color the emotional feel
  - written as natural language, not clinical data
- **TEMPORAL JOURNEY**
  - opening / middle / closing mood-energy-tension read
  - peak / quietest / tensest moments
  - mood journey and transition count
- **EMOTIONAL READ**
  - explainable emotion summary based on measured features
- **LYRICS**
  - Whisper segment count
  - excerpt or graceful skip note
- **SYNTHESIS**
  - lyric-energy/tension alignment
  - peak / tension / quiet lyric moments
- **ALIGNED TIMELINE**
  - per-window moments where transitions / lyrics / tension spikes occur

## 2. Snapshot Analysis (standalone)

```bash
python3 skills/music-analysis/scripts/analyze_music.py /path/to/audio.mp3
python3 skills/music-analysis/scripts/analyze_music.py track.mp3 --json
```

Reports:

- tempo / pulse stability / pulse confidence / swing proxy / pocket
- key estimate / key clarity / chroma entropy / harmonic change / tonal motion / tension
- timbre descriptors (brightness, richness, low-end, contrast, dynamic range)
- section labels (A/B/C...) and repeated material detection
- explainable emotional read with reasons

## 3. Temporal Listen (standalone)

```bash
python3 skills/music-analysis/scripts/temporal_listen.py /path/to/audio.mp3
python3 skills/music-analysis/scripts/temporal_listen.py track.mp3 --json
```

Reports:

- sliding-window timeline (4s windows, 2s hops)
- energy contour
- mood labels
- harmonic tension + tonal motion
- transition types (drop hits, pulls back, tightens harmonically, shifts color, evolves)
- narrative arc (mountain / ascending / descending / plateau / wave)

## Interpretation rules

- **Structure labels are similarity labels**, not verse/chorus claims.
- **Swing proxy is a feel estimate**, not drummer-grade microtiming truth.
- **Emotion is explainable**, derived from pulse + timbre + harmonic tension rather than a black-box mood guess.
- **Lyrics can override the final vibe** when filtered Whisper text is confident and emotionally clear.

## Audio sourcing

The tool needs a real audio file on disk.

- Direct file (mp3, wav, flac, ogg, m4a; anything ffmpeg/librosa can read)
- YouTube / supported URLs: `yt-dlp -x --audio-format mp3 -o "output.mp3" "URL_OR_SEARCH"`

## Whisper lyrics transcription

`listen.py` uses:

- CLI: `/opt/homebrew/bin/whisper-cli`
- Model: `~/.local/share/whisper-cpp/ggml-large-v3-turbo.bin`
- Preprocess: convert input to mono 16kHz WAV via ffmpeg
- Fallback: skip gracefully if Whisper is missing or errors

## Dependencies

Python:

- librosa
- numpy

System:

- ffmpeg
- ffprobe

## Workspace hygiene

- Keep temporary audio files in a dedicated temp/output folder for the skill.
- Avoid modifying unrelated project files while working on audio analysis tasks.
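
The full-listen commands in section 1 can also be driven from Python, which may help when comparing several mixes in one run. The sketch below only assembles the documented flags (`--json`, `--out`) and loads the resulting file. The helper names `listen_command` and `run_listen_json` are hypothetical, and it assumes that `--json --out report.json` writes a valid JSON file. The report's keys are not documented here, so the loaded dict is treated as opaque.

```python
import json
import subprocess

# Script path as it appears in the usage examples above.
LISTEN_SCRIPT = "skills/music-analysis/scripts/listen.py"


def listen_command(audio_path, as_json=False, out_path=None):
    """Build the argument list for a full listen (hypothetical helper)."""
    cmd = ["python3", LISTEN_SCRIPT, audio_path]
    if as_json:
        cmd.append("--json")
    if out_path is not None:
        cmd += ["--out", out_path]
    return cmd


def run_listen_json(audio_path, out_path):
    """Run a full listen and load the JSON report written to out_path.

    Assumes the script writes valid JSON to out_path when both flags are set.
    """
    subprocess.run(
        listen_command(audio_path, as_json=True, out_path=out_path),
        check=True,  # raise if the script exits with a non-zero status
    )
    with open(out_path, encoding="utf-8") as fh:
        return json.load(fh)
```

Passing the arguments as a list rather than a shell string avoids quoting problems with file names that contain spaces.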
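
To make the "4s windows, 2s hops" timeline from section 3 concrete, here is a minimal sketch of how such window boundaries could be laid out over a track. It is an illustration, not the code in `temporal_listen.py`: the function name `window_bounds` is hypothetical, and truncating the last window at the end of the track is an assumption about edge handling.

```python
def window_bounds(duration_s, window_s=4.0, hop_s=2.0):
    """Return (start, end) pairs in seconds for a sliding-window timeline.

    Assumption: the final window is truncated at the end of the track
    instead of being dropped or zero-padded.
    """
    if duration_s <= 0 or window_s <= 0 or hop_s <= 0:
        raise ValueError("duration, window, and hop must all be positive")

    bounds = []
    index = 0
    while True:
        start = index * hop_s
        if start >= duration_s:
            break
        end = min(start + window_s, duration_s)
        bounds.append((start, end))
        if end >= duration_s:
            break  # this window already reaches the end of the track
        index += 1
    return bounds


if __name__ == "__main__":
    for start, end in window_bounds(10.0):
        print(f"{start:4.1f}s to {end:4.1f}s")
```

With a 2s hop and a 4s window, consecutive windows overlap by half, so a change in energy or tension shows up in two adjacent windows rather than falling on a boundary.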
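
The Whisper preprocessing and filtering steps described above can be sketched as follows. This is a hedged illustration, not the logic in `listen.py`: the helper names are hypothetical, the ffmpeg flags (`-ac 1` for mono, `-ar 16000` for 16 kHz) are standard ffmpeg options chosen to match the documented "mono 16kHz WAV" step, and treating any segment that consists only of a bracketed or parenthesized tag as an artifact is an assumption.

```python
import re

# Assumption: an artifact is a segment that is nothing but a tag
# such as "[MUSIC]" or "(applause)".
ARTIFACT_ONLY = re.compile(r"^\s*[\[(][^\])]*[\])]\s*$")


def ffmpeg_mono_16k_command(src, dst):
    """Build an ffmpeg command that converts src to mono 16 kHz audio at dst."""
    return ["ffmpeg", "-y", "-i", src, "-ac", "1", "-ar", "16000", dst]


def keep_real_lyrics(segments):
    """Drop empty segments and tag-only segments, keeping trimmed lyric text."""
    kept = []
    for text in segments:
        stripped = text.strip()
        if stripped and not ARTIFACT_ONLY.match(stripped):
            kept.append(stripped)
    return kept


if __name__ == "__main__":
    print(ffmpeg_mono_16k_command("track.mp3", "track_16k.wav"))
    print(keep_real_lyrics(["[MUSIC]", "  I walk alone ", "(applause)"]))
```

A segment that mixes a tag with words, such as `[MUSIC] hold on`, is kept as-is in this sketch; stripping inline tags would be a separate design decision.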