podcastfy-clawdbot

Generate an AI podcast (MP3) from one or more URLs using the open-source Podcastfy project. Use when the user says “make a podcast from this URL/article/video/PDF” or “turn this webpage into a podcast”, or wants a conversation-style MP3 summary of links. Uses Gemini (via GEMINI_API_KEY) for transcript generation and Edge TTS for free voice synthesis.

16 stars

by diegosouzapw


Best use case

podcastfy-clawdbot is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using podcastfy-clawdbot should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/podcastfy-clawdbot/SKILL.md --create-dirs "https://raw.githubusercontent.com/diegosouzapw/awesome-omni-skill/main/skills/development/podcastfy-clawdbot/SKILL.md"

Manual Installation

1. Download SKILL.md from GitHub.
2. Place it in .claude/skills/podcastfy-clawdbot/SKILL.md inside your project.
3. Restart your AI agent; it will auto-discover the skill.

How podcastfy-clawdbot Compares

Feature / Agent	podcastfy-clawdbot	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

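What does this skill do?

It generates a conversation-style podcast (MP3) from one or more URLs using Podcastfy, with Gemini for transcript generation and Edge TTS for the voices.

Where can I find the source code?

The source is on GitHub in the diegosouzapw/awesome-omni-skill repository, under skills/development/podcastfy-clawdbot.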

SKILL.md Source

# Podcastfy (Clawdbot)

Generate a podcast-style audio conversation (MP3) from a URL (or multiple URLs) using `podcastfy`.

This skill provides a wrapper script that:
- creates/uses a local venv (`{baseDir}/.venv`)
- installs/updates `podcastfy`
- runs Podcastfy with **Gemini** for transcript generation and **Edge** for TTS

## One-time setup

1) Ensure `ffmpeg` is installed on the host.

Ubuntu/Debian:
```bash
sudo apt-get update && sudo apt-get install -y ffmpeg
```

2) Provide a Gemini API key:

Set the `GEMINI_API_KEY` environment variable (recommended), or create a project `.env` file and export its values before running.

Example keys file: `{baseDir}/references/env.example`
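
The two setup requirements above can be checked before a run. The following is a minimal sketch, not part of the wrapper: `parse_env_text` and `preflight` are hypothetical helper names, and the `.env` format is assumed to be plain `KEY=VALUE` lines (optionally prefixed with `export`), which may differ from what `references/env.example` actually contains.

```python
import shutil


def parse_env_text(text):
    """Parse simple KEY=VALUE lines; blank lines and # comments are skipped.

    Assumption: the .env file uses plain KEY=VALUE syntax, optionally
    prefixed with `export`. Multi-line values are not handled.
    """
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        if line.startswith("export "):
            line = line[len("export "):]
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def preflight(env, which=shutil.which):
    """Return a list of setup problems; an empty list means ready to run."""
    problems = []
    if which("ffmpeg") is None:
        problems.append("ffmpeg not found on PATH")
    if not env.get("GEMINI_API_KEY"):
        problems.append("GEMINI_API_KEY is not set")
    return problems
```

Calling `preflight(os.environ)` before invoking the script gives a readable error list instead of a failure partway through generation.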

## Quick start

Generate an MP3 from a single URL:
```bash
cd {baseDir}
export GEMINI_API_KEY="..."
./scripts/podcastfy_generate.py --url "https://example.com/article"
```

Multiple URLs:
```bash
cd {baseDir}
./scripts/podcastfy_generate.py --url "https://a" --url "https://b"
```

Long-form:
```bash
cd {baseDir}
./scripts/podcastfy_generate.py --url "https://example.com/long" --longform
```
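
When the wrapper is driven from another program (for example, an agent or a scheduled job), the same invocations can be assembled as an argument list. This is a sketch under the assumption that the script accepts only the `--url` and `--longform` flags shown above; `build_command` and `generate` are hypothetical names, not part of the skill.

```python
import subprocess


def build_command(urls, longform=False, script="./scripts/podcastfy_generate.py"):
    """Build the argument list: one --url flag per URL, then optional --longform."""
    if not urls:
        raise ValueError("at least one URL is required")
    command = [script]
    for url in urls:
        command += ["--url", url]
    if longform:
        command.append("--longform")
    return command


def generate(urls, base_dir, longform=False):
    """Run the wrapper from base_dir; raises CalledProcessError on failure."""
    return subprocess.run(build_command(urls, longform), cwd=base_dir, check=True)
```

Passing a list rather than a shell string avoids quoting problems with URLs that contain `&` or `?`.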

## Output

The script writes outputs under:
- `{baseDir}/output/audio/` (MP3)
- `{baseDir}/output/transcripts/` (transcript)

Podcastfy prints the final MP3 path on success.
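
If the printed path is not captured, the most recent file in an output directory can be located by modification time. A minimal sketch: `newest_file` is a hypothetical helper, and the glob pattern is left to the caller because the transcript file extension is not stated here.

```python
from pathlib import Path


def newest_file(directory, pattern):
    """Return the most recently modified file matching pattern, or None."""
    candidates = [p for p in Path(directory).glob(pattern) if p.is_file()]
    if not candidates:
        return None
    return max(candidates, key=lambda p: p.stat().st_mtime)
```

For example, `newest_file("output/audio", "*.mp3")` would return the latest MP3 when run from the skill's base directory.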

## Optional tuning

- `PODCASTFY_LLM_MODEL` (default: `gemini-1.5-flash`)
- `PODCASTFY_EDGE_VOICE_Q` (default: `en-US-JennyNeural`)
- `PODCASTFY_EDGE_VOICE_A` (default: `en-US-EricNeural`)
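
The variables above can be resolved with their documented defaults as follows. This is an illustrative sketch only: `resolve_tuning` is a hypothetical name, how the wrapper itself reads these variables is not shown in this document, and treating an empty value as unset is an assumption.

```python
import os

# Defaults as documented in the list above.
DEFAULTS = {
    "PODCASTFY_LLM_MODEL": "gemini-1.5-flash",
    "PODCASTFY_EDGE_VOICE_Q": "en-US-JennyNeural",
    "PODCASTFY_EDGE_VOICE_A": "en-US-EricNeural",
}


def resolve_tuning(env=None):
    """Return each tuning value, falling back to its default when unset or empty."""
    env = os.environ if env is None else env
    return {name: env.get(name) or default for name, default in DEFAULTS.items()}
```

To override a value for a single run, set the variable in the environment passed to the script, for example `PODCASTFY_EDGE_VOICE_Q=en-GB-SoniaNeural`.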

## Automation / reliability tips (important)

### Prefer RSS feeds over browser automation for cron jobs

In isolated cron jobs, avoid relying on a system browser (Chrome/Chromium) or extra Python deps (e.g. `bs4`).
For TechCrunch categories, prefer the RSS feed:
- https://techcrunch.com/category/artificial-intelligence/feed/

This reduces breakage from:
- “No supported browser found …”
- missing site parsing dependencies in the cron runtime
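
Article links can be pulled from an RSS feed with the Python standard library alone, which avoids both the browser and the `bs4` dependency. The sketch below assumes a standard RSS 2.0 layout (`<item>` elements with a `<link>` child); `extract_item_links` and `fetch_feed` are hypothetical helpers, not part of the wrapper.

```python
import urllib.request
import xml.etree.ElementTree as ET


def extract_item_links(rss_xml, limit=3):
    """Return up to `limit` article URLs from RSS 2.0 <item><link> elements."""
    root = ET.fromstring(rss_xml)
    links = []
    for item in root.iter("item"):
        link = (item.findtext("link") or "").strip()
        if link:
            links.append(link)
        if len(links) >= limit:
            break
    return links


def fetch_feed(url, timeout=30):
    """Download the raw feed bytes (network access required)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()
```

The resulting links can then be passed to the script as repeated `--url` arguments.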

### Validate MP3 output (avoid 0-second audio)

We observed `podcastfy` can occasionally produce an MP3 that is effectively empty/truncated (e.g., a ~261-byte file), which shows up as **0s** in Telegram.

This wrapper now validates MP3 output and will automatically fall back to **edge-tts** synthesis from the latest transcript when the MP3 is invalid.
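
A cheap validity check of the kind described above might look like the following. This is a heuristic sketch, not the wrapper's actual validation logic: the 4096-byte threshold is an assumed value chosen only to reject tiny files such as the ~261-byte case, and `looks_like_valid_mp3` is a hypothetical name. It checks the file size and whether the file starts with an ID3 tag or an MPEG audio frame sync; it does not decode the audio or measure duration.

```python
from pathlib import Path

MIN_BYTES = 4096  # assumed threshold, not taken from the wrapper


def looks_like_valid_mp3(path, min_bytes=MIN_BYTES):
    """Heuristic: file is large enough and starts like an MP3."""
    path = Path(path)
    if not path.is_file() or path.stat().st_size < min_bytes:
        return False
    with path.open("rb") as handle:
        header = handle.read(3)
    if header == b"ID3":
        return True  # ID3v2 tag at the start of the file
    # MPEG audio frame sync: the first 11 bits are all set.
    return len(header) >= 2 and header[0] == 0xFF and (header[1] & 0xE0) == 0xE0
```

A stricter check could ask `ffprobe` for the duration, since `ffmpeg` is already a prerequisite.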

## Safety / workflow

- Prefer “draft-first”: tell the user what will be generated (language/length) before running.
- Never paste API keys into chat logs.
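
One way to reduce the risk of a key leaking into chat is to scrub known secret values from any command output before relaying it. A minimal sketch: `redact_secrets` is a hypothetical helper, and it only catches exact matches of the values it is given.

```python
def redact_secrets(text, secrets):
    """Replace each non-empty secret value in text with a placeholder."""
    for secret in secrets:
        if secret:
            text = text.replace(secret, "[REDACTED]")
    return text
```

For example, `redact_secrets(output, [os.environ.get("GEMINI_API_KEY", "")])` can be applied to captured script output before it is posted.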
