Best use case
audio-summary Skill is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
An assistant that transcribes audio and video to text and summarizes it.
Teams using audio-summary Skill should expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in .claude/skills/audio-summary/SKILL.md inside your project
- Restart your AI agent; it will auto-discover the skill
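The manual steps above can also be scripted. The following is a minimal sketch, assuming SKILL.md has already been downloaded to a local path. The helper name `install_skill` and its parameters are hypothetical and not part of the skill itself.

```python
import shutil
from pathlib import Path


def install_skill(skill_md: Path, project_root: Path,
                  skill_name: str = "audio-summary") -> Path:
    """Copy a downloaded SKILL.md into <project>/.claude/skills/<skill_name>/.

    Hypothetical helper for illustration; the target layout follows the
    manual installation steps listed on this page.
    """
    target_dir = project_root / ".claude" / "skills" / skill_name
    # Create the directory tree if it does not exist yet.
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / "SKILL.md"
    shutil.copyfile(skill_md, target)
    return target
```

After the copy, the agent still needs a restart to discover the skill.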
How audio-summary Skill Compares
| Feature / Agent | audio-summary Skill | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
It transcribes audio and video files to text and generates a summary.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
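# audio-summary Skill

An assistant that transcribes audio and video to text and summarizes it.

## Features

1. **Automatic audio extraction**: uses `ffmpeg` to extract compressed 16k mono audio from MP4 and other video files, so the audio fits within the model's size limit.
2. **Transcription to summary**: uses the Bailian `qwen3-asr-flash` model to convert the audio to text and generate a segmented summary of the content.
3. **Large file support**: with 48k compression, supports direct single-pass transcription of videos up to roughly 5-8 minutes long.

## Dependencies

- `ffmpeg` (installed on the system path)
- `openai` Python SDK (installed)
- Bailian API key (configured in the script as `sk-76735...`)

## Usage

### Run from the command line

```powershell
# Extract audio from the specified video and summarize it
python .openclaw/workspace/skills/audio-summary/audio_summary_skill.py "C:\Path\To\Your\Video.mp4"
```

### File location

- The generated summary text is saved automatically in the same directory as the video, named `<video name>_summary.txt`.

## Notes

- A single Base64 transcription request is currently limited to 6MB. For long videos over 10 minutes, split the video manually first or lower the bitrate further.
- API usage is billed at the `qwen3-asr-flash` model rate.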
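The extraction step and the 6MB Base64 limit described in the SKILL.md can be sketched as follows. This is not the skill's actual script, which is not shown on this page. The helper names are hypothetical, and it is an assumption that the limit applies to the Base64-encoded payload and means 6 MiB. The `ffmpeg` options used (`-vn`, `-ac`, `-ar`, `-b:a`) are standard, but the real script may use different ones.

```python
import math
from pathlib import Path

# Assumption: "6MB" is read as 6 MiB and applies to the encoded payload.
BASE64_LIMIT_BYTES = 6 * 1024 * 1024


def build_ffmpeg_command(video: Path, audio: Path, bitrate: str = "48k") -> list[str]:
    """Build an ffmpeg command that extracts 16 kHz mono audio from a video."""
    return [
        "ffmpeg", "-y",      # overwrite the output file if it exists
        "-i", str(video),    # input video
        "-vn",               # drop the video stream
        "-ac", "1",          # mono
        "-ar", "16000",      # 16 kHz sample rate
        "-b:a", bitrate,     # audio bitrate
        str(audio),
    ]


def estimated_audio_bytes(duration_s: float, bitrate_kbps: int) -> int:
    """Rough size of the compressed audio, ignoring container overhead."""
    return int(duration_s * bitrate_kbps * 1000 / 8)


def base64_size(raw_bytes: int) -> int:
    """Length of the Base64 encoding (with padding) of raw_bytes bytes."""
    return 4 * math.ceil(raw_bytes / 3)


def fits_limit(raw_bytes: int, limit: int = BASE64_LIMIT_BYTES) -> bool:
    """Check whether the encoded audio stays within the request limit."""
    return base64_size(raw_bytes) <= limit


def summary_path(video: Path) -> Path:
    """Place the summary next to the video as <video name>_summary.txt."""
    return video.with_name(f"{video.stem}_summary.txt")
```

The command list can be passed to `subprocess.run(cmd, check=True)` on a machine where `ffmpeg` is on the path. The size estimate is only a guide: real files include container overhead, and the encoder may not hit the requested bitrate exactly.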
Related Skills
email-daily-summary
Automatically logs into email accounts (Gmail, Outlook, QQ Mail, etc.) and generates daily email summaries. Use when the user wants to get a summary of their emails, check important messages, or create daily email digests.
email-summary
Fetches recent emails from Gmail and provides concise summaries. Use when the user wants to check emails, get email summaries, or review their inbox.
youtube-audio-download
Download YouTube video audio and convert to MP3. Supports age-restricted videos with cookies.
audio-play
Play audio files using Windows media player. Non-blocking execution.
audio-rename
Rename audio files with Chinese/special characters to simple English names for mlx-stt compatibility.
audiobooklm
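Provides audiobook creation and audio capabilities (ABS read/write, sound effect and audio search, derivative works, voice recommendation, chapter and character analysis, etc.), invoked via HTTP Streamable MCP.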
Daily Summary Skill - 每日总结技能
**Version:** 1.0.0
aibrary-podcast-summary
[Aibrary] Generate a book summary podcast script in a single-narrator storytelling style. Use when the user wants to turn a book into a podcast, create an audio summary of a book, or generate a summary-style podcast script. The output is a narrated monologue that distills a book's key ideas into an engaging 10-15 minute listening experience.
solax-summary-fetch
Fetch inverter summary data from the Solax Cloud API using the npm package solax-cloud-api. Use when the user provides (or has configured) a Solax tokenId and inverter serial number (sn) and wants current/summary energy data returned as JSON (typed as SolaxSummary) for dashboards/automation.
qwen-audio-lab
Hybrid text-to-speech, reusable voice cloning, and narrated audio generation for macOS plus Aliyun Qwen. Use when the user wants to convert text into speech, clone and reuse a voice from a reference recording, generate narration files from plain text or text files, or create PPT speaker-note voiceovers.
deapi-audio
Text-to-speech, voice cloning, voice design, and transcribe audio files via deAPI GPU network. Trigger on 'text to speech', 'TTS', 'generate voice', 'read aloud', 'voice clone', 'clone voice', 'voice design', 'design voice', 'custom voice', 'transcribe audio', 'STT'. For video/YouTube transcription use deapi-video instead.
Audio Transcription Skill
Auto-transcribe voice messages using faster-whisper (local, no API key needed).