ai-notes-of-video

ai-notes-of-video is a video AI notes tool provided by Baidu. Given a video download address supplied by the user, it downloads and parses the video, then generates AI notes for it. Three types of notes can be generated: document notes, outline notes, and image-text notes.

3,891 stars

Best use case

ai-notes-of-video is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using ai-notes-of-video should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/ai-notes-video/SKILL.md --create-dirs "https://raw.githubusercontent.com/openclaw/skills/main/skills/baiduqianfangroup/ai-notes-video/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/ai-notes-video/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How ai-notes-of-video Compares

| Feature / Agent | ai-notes-of-video | Standard Approach |
|-----------------|-------------------|-------------------|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |

Frequently Asked Questions

What does this skill do?

It is a video AI notes tool provided by Baidu. Given a video download address supplied by the user, it downloads and parses the video, then generates AI notes for it. Three types of notes can be generated: document notes, outline notes, and image-text notes.

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# AI Notes of Video

This skill allows OpenClaw agents to generate AI notes based solely on the video address provided by the user.

## Setup

1.  **API Key:** Ensure the `BAIDU_API_KEY` environment variable is set to a valid API key.
2.  **Environment:** The API key must be available in the runtime environment where the scripts run.
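
A minimal sketch of a pre-flight check for the setup above, assuming the scripts read the key from the `BAIDU_API_KEY` environment variable. The helper name `require_api_key` is hypothetical and is not part of the skill.

```python
import os


def require_api_key(env=None):
    """Return the BAIDU_API_KEY value, or raise if it is missing or blank.

    `env` defaults to os.environ; passing a dict makes the check easy to test.
    This helper is illustrative only and does not exist in the skill itself.
    """
    env = os.environ if env is None else env
    key = env.get("BAIDU_API_KEY", "").strip()
    if not key:
        raise RuntimeError("BAIDU_API_KEY is not set in the runtime environment")
    return key
```

Calling `require_api_key()` before invoking either script surfaces a missing key early instead of as an API error.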

## API table
| name | path | description |
|------|------|-------------|
| AINotesTaskCreate | /v2/tools/ai_note/task_create | Create an AI notes task based on the video address provided by the user |
| AINotesTaskQuery | /v2/tools/ai_note/query | Query the AI notes task result based on the task ID |


## Workflow

1. The AINotesTaskCreate API is called through the Python script located at `scripts/ai_notes_task_create.py`.
2. The AINotesTaskQuery API is called through the Python script located at `scripts/ai_notes_task_query.py`.
3. First, call the AINotesTaskCreate API to create a task and obtain the task ID. A video address is required.
4. Second, call the AINotesTaskQuery API to query the task result based on the task ID.
5. Repeat the query step until the task finishes. `status=10002` indicates success and `status=10000` indicates that the task is still in progress. All other status codes indicate failure.
6. Each item in the note list holds one note. For each item, the `tpl_no` field gives the note type: 1 for document (manuscript) notes, 2 for outline notes, 3 for image-text (graphic and text) notes. The `detail` field holds the note details. Within `detail`, `status` is the note status (10002 for success, 10000 for in progress, any other code for failure) and `content` is the note result. The mind map is located at the top of the outline note and is marked by the "Mind" tag.
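
The polling loop in the query step can be sketched as follows. This is an illustration under stated assumptions, not the skill's own code: `query` is a hypothetical callable that returns a dict, and the sketch assumes the task status sits under a top-level `"status"` key. The polling interval and attempt limit are arbitrary choices, since the source does not specify them.

```python
import time

STATUS_IN_PROGRESS = 10000  # task is still running
STATUS_SUCCESS = 10002      # task finished successfully


def classify_status(status):
    """Map a status code to 'success', 'in_progress', or 'failed'."""
    if status == STATUS_SUCCESS:
        return "success"
    if status == STATUS_IN_PROGRESS:
        return "in_progress"
    return "failed"  # every other code is treated as a failure


def wait_for_task(query, task_id, interval=5.0, max_attempts=60, sleep=time.sleep):
    """Call query(task_id) until the task leaves the in-progress state.

    `query` is a hypothetical callable returning a dict with a "status" key
    (an assumed response shape). Returns the final result on success.
    """
    for _ in range(max_attempts):
        result = query(task_id)
        state = classify_status(result.get("status"))
        if state == "success":
            return result
        if state == "failed":
            raise RuntimeError(
                f"task {task_id} failed with status {result.get('status')}"
            )
        sleep(interval)  # still in progress: wait before querying again
    raise TimeoutError(
        f"task {task_id} still in progress after {max_attempts} attempts"
    )
```

Bounding the loop with `max_attempts` avoids polling forever if a task never leaves the in-progress state.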

## APIs

### AINotesTaskCreate API 

#### Parameters

- `video_url`: the URL of the video (required)

#### Example Usage
```bash
BAIDU_API_KEY=xxx python3 scripts/ai_notes_task_create.py 'https://xxxxx.bj.bcebos.com/1%E5%88%86%E9%92%9F_%E6%9C%89%E5%AD%97%E5%B9%95.mp4'
```
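
The same invocation could be driven from Python roughly as follows. This is a sketch that assumes the script takes the video URL as its only positional argument, as in the shell example above. The helper names `build_create_command` and `run_create` are hypothetical, and the script's output format is not documented here, so the raw standard output is returned unparsed.

```python
import os
import subprocess
import sys


def build_create_command(video_url, script="scripts/ai_notes_task_create.py"):
    """Build the argv list for the create script; the video URL is required."""
    if not video_url or not video_url.strip():
        raise ValueError("video_url is required")
    return [sys.executable, script, video_url.strip()]


def run_create(video_url, api_key):
    """Run the create script with BAIDU_API_KEY set and return its raw stdout.

    The output format is an unknown here, so no parsing is attempted.
    """
    env = dict(os.environ, BAIDU_API_KEY=api_key)
    completed = subprocess.run(
        build_create_command(video_url),
        env=env,
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return completed.stdout
```

Passing the URL as a list element rather than through a shell string avoids quoting problems with percent-encoded characters in the address.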

### AINotesTaskQuery API

#### Parameters

- `task_id`: the task ID returned by the AINotesTaskCreate API (required)


#### Example Usage
```bash
BAIDU_API_KEY=xxx python3 scripts/ai_notes_task_query.py "26943ed4-f5a9-4306-a05b-b087665433a0"
```
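
Once a query succeeds, the note list can be interpreted along the lines of the Workflow section. The sketch below uses the `tpl_no`, `detail`, `status`, and `content` fields named there. How the list is nested inside the full response is not documented, so the function takes the list itself as input. The helper name `summarize_notes` and the short type labels are hypothetical.

```python
# tpl_no values as described in the Workflow section
NOTE_TYPES = {1: "document", 2: "outline", 3: "image_text"}


def summarize_notes(notes):
    """Return a (type_name, state, content) tuple for each item in the note list.

    `notes` is assumed to be a list of dicts, each with "tpl_no" and "detail".
    Content is only returned for notes whose own status is 10002 (success).
    """
    summary = []
    for item in notes:
        detail = item.get("detail") or {}
        status = detail.get("status")
        if status == 10002:
            state = "success"
        elif status == 10000:
            state = "in_progress"
        else:
            state = "failed"
        type_name = NOTE_TYPES.get(item.get("tpl_no"), "unknown")
        content = detail.get("content") if state == "success" else None
        summary.append((type_name, state, content))
    return summary
```

Because each note carries its own status, one note type can still be in progress or failed while another has already succeeded, so the per-note check is kept separate from the task-level status.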

Related Skills

Meeting Mastery — AI Meeting Prep, Notes & Follow-Up Engine

3891
from openclaw/skills

You are an elite meeting preparation and follow-up agent. You ensure every meeting is high-value — thoroughly prepared beforehand, cleanly documented during, and actioned after.

Workflow & Productivity

obsidian-notes

3891
from openclaw/skills

Work with Obsidian vaults (plain Markdown notes) and automate via obsidian-cli.

Workflow & Productivity

demo-video

3891
from openclaw/skills

Create product demo videos by automating browser interactions and capturing frames. Use when the user wants to record a demo, walkthrough, product showcase, or interactive video of a web application. Supports Playwright CDP screencast for high-quality capture and FFmpeg for video encoding.

Video Production

seedance-video

3891
from openclaw/skills

Generate AI videos using ByteDance Seedance. Use when the user wants to: (1) generate videos from text prompts, (2) generate videos from images (first frame, first+last frame, reference images), or (3) query/manage video generation tasks. Supports Seedance 1.5 Pro (with audio), 1.0 Pro, 1.0 Pro Fast, and 1.0 Lite models.

recipe-video-extractor

3891
from openclaw/skills

Extract a structured cooking recipe from a shared video URL when the user sends `recipe <url>`. Prioritize caption/description and comments via browser automation, then use web search/fetch as fallback with clear source attribution.

tasknotes

3891
from openclaw/skills

Manage tasks in Obsidian via TaskNotes plugin API. Use when user wants to create tasks, list tasks, query by status or project, update task status, delete tasks, or check what they need to do.

json2video-pinterest

3891
from openclaw/skills

Generate Pinterest-optimized vertical videos using JSON2Video API. Supports AI-generated or URL-based images, AI-generated or provided voiceovers, optional subtitles, and zoom effects. Use when creating video content for Pinterest affiliate marketing, creating vertical social media videos, automating video production with JSON2Video API, or generating videos with voiceovers and subtitles.

arch-video-cut

3891
from openclaw/skills

Automatic Architecture Video Editing Workflow with Self-Learning Preferences

short-video-script-generator-pro

3891
from openclaw/skills

AI short video script generator. Supports TikTok, YouTube Shorts, and Instagram Reels, and automatically generates the hook, shots, voiceover, subtitles, BGM, and CTA. $0.005 USDT per use.

keevx-video-translate

3891
from openclaw/skills

Translate videos into a specified target language using the Keevx API. Supports audio-only translation, subtitle generation, and dynamic duration adjustment. Use this skill when the user needs to (1) Translate/dub a video (2) Translate a video from one language to another (3) Query the list of supported translation languages (4) Check the status of a video translation task. Keywords video translate, Keevx, dubbing.

keevx-image-to-video

3891
from openclaw/skills

Use the Keevx API to convert images to videos. Supports multiple models (V/KL), various resolutions (720p/1080p/4K), and audio generation. Use this skill when the user needs to: (1) Convert images to video (2) Generate video with Keevx (3) Create and query image-to-video tasks (4) Batch image-to-video conversion. Keywords: image to video, Keevx, video generation.

ai-video-prompt

3891
from openclaw/skills

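AI video prompt construction expert. Uses a "first and last frame images + video" workflow and supports stitching multiple 5-second clips into longer videos (30 or 60 seconds). It generates the keyframe images first and then the video prompts, ensuring seamless transitions between segments. Optimized for the Jimeng (即梦) platform, with support for fully Chinese prompt output.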