bilibili-transcript

Transcribe Bilibili videos to text with high accuracy using the Whisper medium model. Use when the user provides a Bilibili video URL (BVxxxxx) and wants to: (1) extract the complete audio content as text, (2) get a detailed summary of the video content, or (3) save the transcript as a formatted TXT file instead of posting long text to Discord. The skill uses CC subtitles automatically when they are available and otherwise falls back to the Whisper medium model with GPU acceleration. Output is saved to the 'Bilibili transcript' folder by default and includes video metadata, a summary section, and the full transcript in Simplified Chinese.

3,891 stars

Best use case

bilibili-transcript is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using bilibili-transcript should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/bilibili-transcript/SKILL.md --create-dirs "https://raw.githubusercontent.com/openclaw/skills/main/skills/54lynnn/bilibili-transcript/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/bilibili-transcript/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How bilibili-transcript Compares

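| Feature / Agent | bilibili-transcript | Standard Approach |
|-----------------|---------------------|-------------------|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |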

Frequently Asked Questions

What does this skill do?

It turns a Bilibili video URL into a formatted TXT transcript with video metadata, a summary section, and the full text in Simplified Chinese. It uses existing subtitles when they are available and falls back to local Whisper medium transcription otherwise.

Where can I find the source code?

The source is in the openclaw/skills repository on GitHub, under skills/54lynnn/bilibili-transcript. The Installation section gives the raw URL of SKILL.md.

SKILL.md Source

# Bilibili Transcript v2.2

High-accuracy Bilibili video transcription with multi-language AI subtitle support.

## Overview

This skill provides a **complete transcription workflow** for Bilibili videos:

1. **Extract Video Metadata** - Title, author, publish date, duration
2. **Smart Subtitle Detection** - Priority: CC subtitles → AI subtitles (multi-language) → Whisper transcription
3. **Multi-language AI Subtitle Support** - Auto-detects: `ai-zh`, `ai-en`, `ai-ja`, `ai-es`, `ai-ar`, `ai-pt`, `ai-ko`, `ai-de`, `ai-fr`
4. **Browser Cookie Support** - WSL Chromium or Windows Edge for member-only videos
5. **Formatted Output** - Saves as structured TXT file with metadata + summary placeholder + full transcript
6. **Simplified Chinese** - Automatically converts Traditional to Simplified Chinese
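
The Traditional-to-Simplified conversion in item 6 could be done with the `opencc` command-line tool. The following is a minimal sketch, assuming the `t2s.json` configuration that ships with OpenCC. The function name `to_simplified` is hypothetical and may not match what the script actually does.

```bash
# Hypothetical helper: convert a transcript to Simplified Chinese.
# Falls back to a plain copy when opencc is not installed, because the
# Requirements section lists opencc as optional.
to_simplified() {
  local src="$1"
  local dst="$2"
  if command -v opencc >/dev/null 2>&1; then
    # t2s.json is OpenCC's Traditional-to-Simplified configuration.
    opencc -i "$src" -o "$dst" -c t2s.json
  else
    cp "$src" "$dst"
  fi
}
```

Calling `to_simplified raw.txt final.txt` leaves text that is already Simplified (or not Chinese at all) unchanged.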

## What's New in v2.2

- ✅ **Fixed cookie detection** - Now uses browser config directory instead of SQLite file (avoids encoding errors)
- ✅ **One-stop solution** - CC subtitles → AI subtitles → Whisper transcription, all in one script
- ✅ **Better WSL support** - Automatically detects WSL Chromium and Windows Edge cookies
- ✅ **Smart fallback** - Seamlessly switches between subtitle sources without user intervention

## What's New in v2.1

- ✅ **Improved cookie handling** - Fixed UTF-8 encoding issues with snap Chromium
- ✅ **Three-tier fallback** - CC subtitles → AI subtitles → Whisper transcription
- ✅ **Better error handling** - Gracefully degrades when cookie sources fail

## What's New in v2.0

- ✅ **Multi-language AI subtitles** - Supports 9 languages: Chinese, English, Japanese, Spanish, Arabic, Portuguese, Korean, German, French
- ✅ **WSL Chromium support** - Better cookie extraction than Windows Edge
- ✅ **Correct subtitle download** - Uses `--write-subs --write-auto-subs` combo
- ✅ **Language auto-detection** - Automatically finds available AI subtitle language

## AI Subtitle Language Codes

Bilibili uses `ai-` prefix for AI-generated subtitles:

| Code | Language | 语言 |
|------|----------|------|
| `ai-zh` | Chinese | 中文 |
| `ai-en` | English | 英文 |
| `ai-ja` | Japanese | 日文 |
| `ai-es` | Spanish | 西班牙文 |
| `ai-ar` | Arabic | 阿拉伯文 |
| `ai-pt` | Portuguese | 葡萄牙文 |
| `ai-ko` | Korean | 韩文 |
| `ai-de` | German | 德文 |
| `ai-fr` | French | 法文 |
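
Language auto-detection can be thought of as picking the first code from a preference list that the video actually offers. The sketch below assumes the preference order matches the order of the table above, which is an assumption because the document lists the codes but does not rank them. `pick_ai_lang` and `AI_LANG_PRIORITY` are hypothetical names.

```bash
# Assumed preference order (same order as the table above).
AI_LANG_PRIORITY="ai-zh ai-en ai-ja ai-es ai-ar ai-pt ai-ko ai-de ai-fr"

# Print the first code from AI_LANG_PRIORITY that appears in the given
# whitespace-separated list of available subtitle languages.
# Returns non-zero when no AI subtitle language is available.
pick_ai_lang() {
  local available=" $* "
  local code
  for code in $AI_LANG_PRIORITY; do
    case "$available" in
      *" $code "*)
        printf '%s\n' "$code"
        return 0
        ;;
    esac
  done
  return 1
}
```

For example, `pick_ai_lang zh-CN ai-en ai-ja` selects `ai-en`, because `zh-CN` is a manual subtitle code and `ai-en` ranks ahead of `ai-ja`.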

## Requirements

### Hardware (Your Setup)
- **GPU**: NVIDIA RTX 4070 Super (12GB VRAM) - ✅ Well above the roughly 5GB of VRAM the Whisper medium model needs
- **WSL Memory**: 16GB (configured)
- **WSL CPU**: 6 cores (configured)

### Software
- `yt-dlp` - Video/audio download
- `ffmpeg` - Audio processing
- `whisper` - Speech-to-text (local, no API key)
- `opencc` - Traditional to Simplified Chinese conversion (optional)
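
Before running the script it can help to confirm that these tools are on `PATH`. This is a minimal sketch, not part of the original skill, and `missing_deps` is a hypothetical helper name.

```bash
# Print the names of the given commands that are not on PATH.
# Returns non-zero if at least one is missing.
missing_deps() {
  local missing=""
  local cmd
  for cmd in "$@"; do
    if ! command -v "$cmd" >/dev/null 2>&1; then
      missing="$missing $cmd"
    fi
  done
  if [ -n "$missing" ]; then
    # Strip the leading space before printing.
    printf '%s\n' "${missing# }"
    return 1
  fi
  return 0
}

# Usage: required tools fail hard, opencc only warns because it is optional.
# missing_deps yt-dlp ffmpeg whisper || exit 1
# missing_deps opencc || echo "opencc not found; skipping conversion" >&2
```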

### Browser (for AI subtitles)
- **WSL Chromium** (recommended) - Log in to Bilibili in WSL
- **Windows Edge** - Alternative option

## Workflow

### Step 1: Run Transcription Script

```bash
./scripts/bilibili_transcript.sh "https://www.bilibili.com/video/BVxxxxx"
```

**Priority order:**
1. **CC Subtitles** (manual) - Fastest, highest accuracy
2. **AI Subtitles** (auto-generated) - Fast, good accuracy, multi-language
3. **Whisper Transcription** - Slowest, ~95% accuracy, works for all videos
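
The priority order above amounts to trying each source in turn and stopping at the first one that succeeds. The sketch below illustrates that control flow only. The three step functions are assumptions about how each tier might be invoked with `yt-dlp` and `whisper`, and none of the function names come from the actual script.

```bash
# Each step takes URL and OUT_DIR and returns 0 only if it produced output.

fetch_cc_subs() {
  # Manual (CC) subtitles only; no media download.
  yt-dlp --skip-download --write-subs --sub-langs "zh-CN,zh-TW,en,ja" \
    --convert-subs srt -o "$2/%(id)s.%(ext)s" "$1" \
    && ls "$2"/*.srt >/dev/null 2>&1
}

fetch_ai_subs() {
  # AI subtitles use the ai- prefix; --sub-langs accepts regex patterns.
  yt-dlp --skip-download --write-subs --write-auto-subs --sub-langs "ai-.*" \
    --convert-subs srt -o "$2/%(id)s.%(ext)s" "$1" \
    && ls "$2"/*.srt >/dev/null 2>&1
}

run_whisper() {
  # Download audio only, then transcribe locally with the medium model.
  yt-dlp -x --audio-format mp3 -o "$2/audio.%(ext)s" "$1" \
    && whisper "$2/audio.mp3" --model medium --output_format txt --output_dir "$2"
}

# Try each tier in priority order and print the name of the one that worked.
transcribe_with_fallback() {
  local url="$1"
  local out_dir="$2"
  local step
  for step in fetch_cc_subs fetch_ai_subs run_whisper; do
    if "$step" "$url" "$out_dir"; then
      printf '%s\n' "$step"
      return 0
    fi
  done
  return 1
}
```

Because `yt-dlp` can exit successfully even when a video has no subtitles, each subtitle step also checks that a `.srt` file was actually written before reporting success.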

### Step 2: Generate Detailed Summary

After the script completes, read the generated TXT file and:
1. Read the full transcript (第二部分, Part 2)
2. Generate a comprehensive summary and write it into the summary section (第一部分, Part 1)
3. Save the updated file

### Step 3: Present to User

In Discord, post:
- **Brief summary** in message
- **Attach the TXT file** for full content

## Set Up WSL Chromium Login

For best results with AI subtitles:

1. Start WSL Chromium:
   ```bash
   chromium-browser &
   ```

2. Navigate to bilibili.com

3. Log in with your Bilibili account

4. Run the transcription script

The script will automatically use Chromium's cookies to access member-only AI subtitles.
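
One way to hand a browser's cookies to `yt-dlp` is its `--cookies-from-browser BROWSER:PROFILE` option, where `PROFILE` may be a directory path. The sketch below picks the first profile directory that exists. The default snap Chromium path and the idea of passing an Edge directory as the second argument are assumptions about a typical WSL setup, and `cookie_spec` is a hypothetical name.

```bash
# Print a BROWSER:PATH spec for the first browser profile directory found.
# $1: Chromium config directory (assumed default: snap Chromium's location).
# $2: Windows Edge "User Data" directory as seen from WSL (no default).
# Returns non-zero when neither directory exists.
cookie_spec() {
  local chromium_dir="${1:-$HOME/snap/chromium/common/chromium}"
  local edge_dir="${2:-}"
  if [ -d "$chromium_dir" ]; then
    printf '%s\n' "chromium:$chromium_dir"
  elif [ -n "$edge_dir" ] && [ -d "$edge_dir" ]; then
    printf '%s\n' "edge:$edge_dir"
  else
    return 1
  fi
}

# Usage (quote the spec, since Windows paths often contain spaces):
# if spec="$(cookie_spec)"; then
#   yt-dlp --cookies-from-browser "$spec" --list-subs "$url"
# fi
```

Returning the spec as a single value, rather than a full argument string, keeps paths with spaces intact when the caller quotes it.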

## Usage Examples

### Example 1: Basic Transcription (Default Output)
```bash
./scripts/bilibili_transcript.sh "https://www.bilibili.com/video/BV1Z1wJzgEAj/"
# Output: workspace/Bilibili transcript/[VideoTitle]_BVxxxxx_transcript.txt
```

### Example 2: Custom Output Directory
```bash
./scripts/bilibili_transcript.sh "https://www.bilibili.com/video/BV1Z1wJzgEAj/" ~/Documents
```

## Notes

### Model Selection
- **Your config**: RTX 4070 Super 12GB + 16GB RAM + 6 cores
- **Default**: `medium` model (~95% accuracy, balanced speed) ✅
- **Fallback**: If GPU unavailable, automatically uses CPU (slower)
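
The GPU-to-CPU fallback can be sketched as a simple probe. This version assumes that a working `nvidia-smi` implies a usable CUDA device, which is a simplification: PyTorch must also be built with CUDA support. `pick_device` is a hypothetical name, and the probe command is a parameter so the logic can be exercised without a GPU.

```bash
# Print "cuda" when the probe command exists and exits successfully,
# otherwise print "cpu". The probe defaults to nvidia-smi.
pick_device() {
  local probe="${1:-nvidia-smi}"
  if command -v "$probe" >/dev/null 2>&1 && "$probe" >/dev/null 2>&1; then
    printf 'cuda\n'
  else
    printf 'cpu\n'
  fi
}

# Usage (requires whisper and an audio file):
# whisper audio.mp3 --model medium --device "$(pick_device)"
```

The `whisper` CLI already defaults to CUDA when PyTorch can see a GPU, so an explicit probe like this is mainly useful for logging which path was taken.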

### Accuracy Comparison
| Source | Accuracy | Speed | Best For |
|--------|----------|-------|----------|
| CC Subtitles | 100% | ⚡ Instant | All videos with manual subtitles |
| AI Subtitles (ai-zh) | ~90% | ⚡ Instant | Chinese videos |
| AI Subtitles (ai-en) | ~85% | ⚡ Instant | English videos |
| Whisper medium | ~95% | 🐢 Slow | Videos without subtitles |

### Default Output Directory
- **Location**: `workspace/Bilibili transcript/`
- **Created automatically** on first run
- All transcript files organized in one place

### File Naming
Output files are named: `[VideoTitle]_[BVID]_transcript.txt`
- Special characters (including Chinese punctuation) are replaced with underscores
- Title truncated to 50 characters
- Example: `股票分红_是从左口袋掏右口袋吗_BV1ddzUYTE27_transcript.txt`
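
The naming rules above can be sketched in bash as follows. Several details are assumptions: the exact set of Chinese punctuation marks replaced, sanitizing before truncating, and leaving runs of underscores uncollapsed. All three function names are hypothetical. Character-based truncation of Chinese titles relies on a UTF-8 locale.

```bash
# Extract the BV id (BV followed by 10 alphanumerics) from a URL.
extract_bvid() {
  local re='BV[0-9A-Za-z]{10}'
  if [[ "$1" =~ $re ]]; then
    printf '%s\n' "${BASH_REMATCH[0]}"
  else
    return 1
  fi
}

# Replace ASCII punctuation, whitespace, and a few common Chinese
# punctuation marks with underscores, then keep the first 50 characters.
sanitize_title() {
  local title="$1"
  title="${title//[[:punct:][:space:],。?!:;、()《》]/_}"
  printf '%s\n' "${title:0:50}"
}

# Build "[VideoTitle]_[BVID]_transcript.txt" from a title and a video URL.
build_output_name() {
  local title="$1"
  local url="$2"
  local bvid
  bvid="$(extract_bvid "$url")" || return 1
  printf '%s_%s_transcript.txt\n' "$(sanitize_title "$title")" "$bvid"
}
```

For example, `build_output_name "My Video" "https://www.bilibili.com/video/BV1ddzUYTE27"` prints `My_Video_BV1ddzUYTE27_transcript.txt`.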

### Subtitle Priority
The script tries subtitles in this order:
1. Manual CC subtitles (zh-CN, zh-TW, en, ja, etc.)
2. AI subtitles (any available language: ai-zh, ai-en, ai-ja, etc.)
3. Whisper voice transcription (fallback)

This order tries the fastest sources first and falls back to Whisper only when no subtitles exist, which keeps processing quick without giving up accuracy.
