podcastfy-clawdbot
Generate an AI podcast (MP3) from one or more URLs using the open-source Podcastfy project. Use when the user says “make a podcast from this URL/article/video/PDF”, “turn this webpage into a podcast”, or wants an MP3 conversation-style summary from links. Uses Gemini for transcript generation via GEMINI_API_KEY and Edge TTS for free voice.
Best use case
podcastfy-clawdbot is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Teams using podcastfy-clawdbot should expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in .claude/skills/podcastfy-clawdbot/SKILL.md inside your project
- Restart your AI agent; it will auto-discover the skill
How podcastfy-clawdbot Compares
| Feature / Agent | podcastfy-clawdbot | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
It generates a conversation-style podcast (MP3) from one or more URLs (articles, videos, PDFs, or other web pages) using the open-source Podcastfy project. Gemini writes the transcript (via GEMINI_API_KEY) and Edge TTS provides the voices at no cost.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# Podcastfy (Clawdbot)
Generate a podcast-style audio conversation (MP3) from a URL (or multiple URLs) using `podcastfy`.
This skill provides a wrapper script that:
- creates/uses a local venv (`{baseDir}/.venv`)
- installs/updates `podcastfy`
- runs Podcastfy with **Gemini** for transcript generation and **Edge** for TTS
## One-time setup
1) Ensure `ffmpeg` is installed on the host.
Ubuntu/Debian:
```bash
sudo apt-get update && sudo apt-get install -y ffmpeg
```
2) Provide a Gemini API key:
Set the `GEMINI_API_KEY` environment variable (recommended), or create a project `.env` file and export its contents before running.
Example keys file: `{baseDir}/references/env.example`
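
The two setup requirements above can be checked before a run. The following is a minimal sketch, not part of the wrapper itself: `preflight` is a hypothetical helper name, and it only assumes that `ffmpeg` must be on `PATH` and that `GEMINI_API_KEY` must be non-empty.

```python
import os
import shutil


def preflight(env=None, which=shutil.which):
    """Return a list of setup problems; an empty list means ready to run.

    `env` and `which` are injectable so the check can be tested without
    touching the real environment (hypothetical helper, not wrapper code).
    """
    env = os.environ if env is None else env
    problems = []
    if which("ffmpeg") is None:
        problems.append("ffmpeg not found on PATH")
    if not env.get("GEMINI_API_KEY"):
        problems.append("GEMINI_API_KEY is not set")
    return problems
```

Calling `preflight()` with no arguments inspects the current process environment; printing the returned list gives a quick diagnosis before invoking the script.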
## Quick start
Generate an MP3 from a single URL:
```bash
cd {baseDir}
export GEMINI_API_KEY="..."
./scripts/podcastfy_generate.py --url "https://example.com/article"
```
Multiple URLs:
```bash
cd {baseDir}
./scripts/podcastfy_generate.py --url "https://a" --url "https://b"
```
Long-form:
```bash
cd {baseDir}
./scripts/podcastfy_generate.py --url "https://example.com/long" --longform
```
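
When the wrapper is driven from another Python program rather than a shell, the same invocations can be assembled as an argument list. This is a sketch under the assumption that `--url` (repeatable) and `--longform` are the only flags needed, as in the examples above; `build_command` and `generate` are hypothetical helper names.

```python
import subprocess


def build_command(urls, longform=False, script="./scripts/podcastfy_generate.py"):
    """Build the argv list for the wrapper script shown in the quick start."""
    if not urls:
        raise ValueError("at least one URL is required")
    cmd = [script]
    for url in urls:
        cmd += ["--url", url]  # one --url flag per source, as in the examples
    if longform:
        cmd.append("--longform")
    return cmd


def generate(urls, base_dir, longform=False):
    """Run the wrapper from base_dir; raises CalledProcessError on failure."""
    return subprocess.run(build_command(urls, longform), cwd=base_dir, check=True)
```

Passing a list instead of a shell string avoids quoting problems with URLs that contain `&` or `?`.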
## Output
The script writes outputs under:
- `{baseDir}/output/audio/` (MP3)
- `{baseDir}/output/transcripts/` (transcript)
Podcastfy prints the final MP3 path on success.
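
If the printed path is not captured, the newest MP3 can be located from the output directory instead. A minimal sketch, assuming files end in `.mp3` and that modification time reflects generation order; `latest_mp3` is a hypothetical helper name.

```python
from pathlib import Path


def latest_mp3(base_dir):
    """Return the most recently modified MP3 under output/audio, or None."""
    audio_dir = Path(base_dir) / "output" / "audio"
    if not audio_dir.is_dir():
        return None
    candidates = sorted(audio_dir.glob("*.mp3"), key=lambda p: p.stat().st_mtime)
    return candidates[-1] if candidates else None
```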
## Optional tuning
- `PODCASTFY_LLM_MODEL` (default: `gemini-1.5-flash`)
- `PODCASTFY_EDGE_VOICE_Q` (default: `en-US-JennyNeural`)
- `PODCASTFY_EDGE_VOICE_A` (default: `en-US-EricNeural`)
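
The variables above behave like ordinary environment overrides with defaults. The sketch below illustrates that pattern only; how the wrapper actually reads these variables is not shown in this document, and `resolve_tuning` is a hypothetical helper name.

```python
import os

# Defaults as listed in this section.
DEFAULTS = {
    "PODCASTFY_LLM_MODEL": "gemini-1.5-flash",
    "PODCASTFY_EDGE_VOICE_Q": "en-US-JennyNeural",
    "PODCASTFY_EDGE_VOICE_A": "en-US-EricNeural",
}


def resolve_tuning(env=None):
    """Return the effective settings: env values win, empty values fall back."""
    env = os.environ if env is None else env
    return {name: env.get(name) or default for name, default in DEFAULTS.items()}
```

For a single run, set the override in the shell before invoking the script, for example `export PODCASTFY_LLM_MODEL="gemini-1.5-pro"` (an illustrative value, not a recommendation from the skill).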
## Automation / reliability tips (important)
### Prefer RSS feeds over browser automation for cron jobs
In isolated cron jobs, avoid relying on a system browser (Chrome/Chromium) or extra Python deps (e.g. `bs4`).
For TechCrunch categories, prefer the RSS feed:
- https://techcrunch.com/category/artificial-intelligence/feed/
This reduces breakage from:
- “No supported browser found …”
- missing site parsing dependencies in the cron runtime
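
An RSS feed can be read with the Python standard library alone, which avoids both the browser and `bs4`. This is a minimal sketch, assuming a standard RSS 2.0 layout where each `<item>` carries a `<link>`; `article_links` is a hypothetical helper name, and the wrapper may select articles differently.

```python
import xml.etree.ElementTree as ET


def article_links(rss_xml, limit=3):
    """Extract up to `limit` article URLs from an RSS 2.0 document string."""
    root = ET.fromstring(rss_xml)
    links = []
    for item in root.iter("item"):
        link = item.findtext("link")
        if link and link.strip():
            links.append(link.strip())
    return links[:limit]
```

The feed body can be fetched with `urllib.request.urlopen(feed_url).read()`, and each returned link can then be passed to the script as a `--url` argument.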
### Validate MP3 output (avoid 0-second audio)
We observed that `podcastfy` can occasionally produce an MP3 that is effectively empty or truncated (e.g., a ~261-byte file), which shows up as **0s** in Telegram.
This wrapper now validates MP3 output and will automatically fall back to **edge-tts** synthesis from the latest transcript when the MP3 is invalid.
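
The kind of check involved can be sketched as follows. This is an illustration, not the wrapper's actual validation logic: the 4096-byte and 1-second thresholds are assumptions, the function names are hypothetical, and the duration check is skipped when `ffprobe` (shipped with `ffmpeg`) is not available.

```python
import shutil
import subprocess
from pathlib import Path

MIN_BYTES = 4096  # assumed threshold; the observed bad file was ~261 bytes


def probe_duration_seconds(path):
    """Return the duration reported by ffprobe.

    Returns None if ffprobe is not installed, and 0.0 if ffprobe runs
    but cannot report a usable duration.
    """
    if shutil.which("ffprobe") is None:
        return None
    result = subprocess.run(
        ["ffprobe", "-v", "error", "-show_entries", "format=duration",
         "-of", "default=noprint_wrappers=1:nokey=1", str(path)],
        capture_output=True, text=True,
    )
    try:
        return float(result.stdout.strip())
    except ValueError:
        return 0.0


def mp3_looks_valid(path, min_bytes=MIN_BYTES, min_seconds=1.0):
    """Reject missing, tiny, or near-zero-duration files."""
    path = Path(path)
    if not path.is_file() or path.stat().st_size < min_bytes:
        return False
    duration = probe_duration_seconds(path)
    return duration is None or duration >= min_seconds
```

A caller would run this on the generated file and trigger the fallback synthesis only when it returns `False`.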
## Safety / workflow
- Prefer “draft-first”: tell the user what will be generated (language/length) before running.
- Never paste API keys into chat logs.