pipecat-friday-agent

Build a low-latency, Iron Man-inspired tactical voice assistant (F.R.I.D.A.Y.) using Pipecat, Gemini, and OpenAI.

31,392 stars

Best use case

pipecat-friday-agent is best used when you need a repeatable AI agent workflow instead of a one-off prompt. It is especially useful for teams working in multi.

Users should expect a more consistent workflow output, faster repeated execution, and less time spent rewriting prompts from scratch.

Practical example

Example input

Use the "pipecat-friday-agent" skill to help with this workflow task. Context: Build a low-latency, Iron Man-inspired tactical voice assistant (F.R.I.D.A.Y.) using Pipecat, Gemini, and OpenAI.

Example output

A structured workflow result with clearer steps, more consistent formatting, and an output that is easier to reuse in the next run.

When to use this skill

Use this skill when you want a reusable workflow rather than writing the same prompt again and again.

When not to use this skill

Do not use this when you only need a one-off answer and do not need a reusable workflow.
Do not use it if you cannot install or maintain the related files, repository context, or supporting tools.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/pipecat-friday-agent/SKILL.md --create-dirs "https://raw.githubusercontent.com/sickn33/antigravity-awesome-skills/main/plugins/antigravity-awesome-skills-claude/skills/pipecat-friday-agent/SKILL.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/pipecat-friday-agent/SKILL.md inside your project
Restart your AI agent — it will auto-discover the skill

How pipecat-friday-agent Compares

Feature / Agent	pipecat-friday-agent	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

What does this skill do?

Build a low-latency, Iron Man-inspired tactical voice assistant (F.R.I.D.A.Y.) using Pipecat, Gemini, and OpenAI.

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# Pipecat Friday Agent

## Overview

This skill provides a blueprint for building **F.R.I.D.A.Y.** (Female Replacement Intelligent Digital Assistant Youth), a local voice assistant inspired by the tactical AI from the Iron Man films. It uses the **Pipecat** framework to orchestrate a low-latency pipeline:
- **STT**: OpenAI Whisper (`whisper-1`) or `gpt-4o-transcribe`
- **LLM**: Google Gemini 2.5 Flash (via a compatibility shim)
- **TTS**: OpenAI TTS (`nova` voice)
- **Transport**: Local Audio (Hardware Mic/Speakers)

## When to Use This Skill

- Use when you want to build a real-time, conversational voice agent.
- Use when working with the Pipecat framework for pipeline-based AI.
- Use when you need to integrate multiple providers (Google and OpenAI) into a single voice loop.
- Use when building Iron Man-themed or tactical-themed voice applications.

## How It Works

### Step 1: Install Dependencies

You will need the Pipecat framework and its service providers installed:
```bash
# Quote the package spec so shells such as zsh do not expand the brackets
pip install "pipecat-ai[openai,google,silero]" python-dotenv
```

### Step 2: Configure Environment

Create a `.env` file with your API keys:
```env
OPENAI_API_KEY=your_openai_key
GOOGLE_API_KEY=your_google_key
```
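
Before starting the agent, it can help to fail fast when a key is missing rather than discovering it mid-conversation. The following is a minimal sketch, not part of the provided script: `missing_keys` and `load_and_check` are hypothetical helper names, and the only assumption taken from this document is the pair of variable names in the `.env` example above.

```python
import os
from typing import Mapping

# Key names taken from the .env example above.
REQUIRED_KEYS = ("OPENAI_API_KEY", "GOOGLE_API_KEY")


def missing_keys(env: Mapping[str, str]) -> list[str]:
    """Return the required keys that are absent or blank in `env`."""
    return [key for key in REQUIRED_KEYS if not env.get(key, "").strip()]


def load_and_check() -> None:
    """Load .env if python-dotenv is available, then verify the keys."""
    try:
        # python-dotenv copies values from .env into os.environ.
        from dotenv import load_dotenv

        load_dotenv()
    except ImportError:
        pass  # fall back to variables already exported in the shell

    missing = missing_keys(os.environ)
    if missing:
        raise SystemExit(f"Missing API keys: {', '.join(missing)}")
```

Calling `load_and_check()` at the top of a launcher script would stop the run with a readable message if either key is unset or empty.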

### Step 3: Run the Agent

Execute the provided Python script to start the interface:
```bash
python scripts/friday_agent.py
```

## Core Concepts

### Pipeline Architecture
The agent follows a linear pipeline: `Mic -> VAD -> STT -> LLM -> TTS -> Speaker`. This allows for granular control over each stage, unlike end-to-end speech-to-speech models.
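
To illustrate what a linear pipeline means in practice, here is a toy sketch in plain `asyncio`. It deliberately does not use Pipecat's classes; every stage name and the string "frames" are stand-ins invented for illustration. The point is only that each stage consumes the previous stage's stream, so any single stage can be swapped or inspected on its own.

```python
import asyncio
from typing import AsyncIterator


async def mic() -> AsyncIterator[str]:
    # Stand-in for captured audio chunks; "" represents silence.
    for chunk in ["", "status", "", "report"]:
        yield chunk


async def vad(frames: AsyncIterator[str]) -> AsyncIterator[str]:
    async for frame in frames:
        if frame:  # drop silence so later stages never see it
            yield frame


async def stt(frames: AsyncIterator[str]) -> AsyncIterator[str]:
    async for frame in frames:
        yield f"text:{frame}"  # pretend transcription


async def llm(frames: AsyncIterator[str]) -> AsyncIterator[str]:
    async for frame in frames:
        yield f"reply({frame})"  # pretend model response


async def tts(frames: AsyncIterator[str]) -> AsyncIterator[str]:
    async for frame in frames:
        yield f"audio[{frame}]"  # pretend synthesized speech


def build(source: AsyncIterator[str], *stages) -> AsyncIterator[str]:
    """Chain stages so each one reads the previous stage's output."""
    stream = source
    for stage in stages:
        stream = stage(stream)
    return stream


async def run_pipeline() -> list[str]:
    """Collect what would be sent to the speaker."""
    return [frame async for frame in build(mic(), vad, stt, llm, tts)]
```

Running `asyncio.run(run_pipeline())` yields two output frames, one per non-silent chunk, because the VAD stage filters silence before anything reaches the STT or LLM stand-ins.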

### Google Compatibility Shim
Because Google's Gemini API uses a different message format from the OpenAI-style messages that Pipecat aggregators expect, the script includes `GoogleSafeContext` and `GoogleSafeMessage` classes to bridge the gap.
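
The kind of translation such a shim performs can be sketched as follows. This is not the script's `GoogleSafeContext` implementation, which is not shown here; `to_gemini` is a hypothetical helper that handles plain-text messages only (no tool calls or multimodal content). It relies on two properties of the Gemini format: conversation turns use the roles `user` and `model` with a `parts` list, and system prompts are passed separately as a system instruction.

```python
from typing import Any


def to_gemini(messages: list[dict[str, Any]]) -> tuple[str, list[dict[str, Any]]]:
    """Split OpenAI-style chat messages into (system_instruction, contents)."""
    system_parts: list[str] = []
    contents: list[dict[str, Any]] = []
    for msg in messages:
        role = msg["role"]
        text = msg.get("content") or ""
        if role == "system":
            # Gemini takes system text outside the turn list.
            system_parts.append(text)
            continue
        # OpenAI's "assistant" corresponds to Gemini's "model".
        gemini_role = "model" if role == "assistant" else "user"
        contents.append({"role": gemini_role, "parts": [{"text": text}]})
    return "\n".join(system_parts), contents
```

For example, a three-message history with one system, one user, and one assistant entry becomes a system instruction string plus two `contents` entries with roles `user` and `model`.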

## Best Practices

- ✅ **Use Silero VAD**: It is robust for local hardware and prevents background noise from triggering the LLM.
- ✅ **Concise Prompts**: Tactical agents should give short, data-dense responses to minimize latency.
- ✅ **Sample Rate Match**: OpenAI TTS outputs audio at 24 kHz; set your `audio_out_sample_rate` to match, otherwise playback will sound either sped-up and high-pitched or slowed and low-pitched.
- ❌ **No Polite Fillers**: Avoid "Hello, how can I help you today?" Instead, use "Systems nominal. Ready for commands."
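
The sample-rate point above comes down to simple arithmetic: audio played back without resampling runs at `device_rate / source_rate` times its intended speed, and pitch shifts by `12 * log2` of that ratio in semitones. A small sketch, with `playback_distortion` as a hypothetical helper name:

```python
import math


def playback_distortion(source_hz: int, device_hz: int) -> tuple[float, float]:
    """Return (speed_factor, pitch_shift_semitones) for audio produced at
    source_hz but played back, unresampled, at device_hz."""
    speed = device_hz / source_hz
    semitones = 12 * math.log2(speed)
    return speed, semitones


# 24 kHz TTS audio sent to a 48 kHz output: twice as fast, one octave up.
print(playback_distortion(24000, 48000))  # (2.0, 12.0)
```

A device rate below 24 kHz gives a speed factor under 1 and a negative semitone shift, which is the slowed, low-pitched case.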

## Troubleshooting

- **Problem:** Audio is choppy or delayed.
- **Solution:** Check your `OUTPUT_DEVICE` index. Run a script like `test_audio_output.py` to find the correct hardware index for your OS.
- **Problem:** "Validation error" for message format.
- **Solution:** Ensure the `GoogleSafeContext` shim is correctly translating OpenAI-style dicts to Gemini-style schema.
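
For the device-index problem, a listing script along these lines may help. This is a sketch of what a `test_audio_output.py`-style helper could look like, not the actual file. It assumes PyAudio is installed and that the local audio transport addresses devices by PyAudio index, neither of which this document states; `output_devices` and `query_devices` are hypothetical names.

```python
from typing import Any


def output_devices(devices: list[dict[str, Any]]) -> list[tuple[int, str]]:
    """Keep only devices that can play audio, as (index, name) pairs."""
    return [
        (d["index"], d["name"])
        for d in devices
        if d.get("maxOutputChannels", 0) > 0
    ]


def query_devices() -> list[dict[str, Any]]:
    """Collect device info dicts from PyAudio (imported lazily)."""
    import pyaudio  # assumption: PyAudio is installed

    pa = pyaudio.PyAudio()
    try:
        return [
            pa.get_device_info_by_index(i)
            for i in range(pa.get_device_count())
        ]
    finally:
        pa.terminate()  # always release the audio backend
```

Printing `output_devices(query_devices())` shows the candidate indices; the one matching your speakers or headset is the value to try for `OUTPUT_DEVICE`.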

## Related Skills

- `@voice-agents` - General principles of voice AI.
- `@agent-tool-builder` - Add tools (Search, Lights, etc.) to your Friday agent.
- `@llm-architect` - Optimizing the LLM layer.

## Limitations
- Use this skill only when the task clearly matches the scope described above.
- Do not treat the output as a substitute for environment-specific validation, testing, or expert review.
- Stop and ask for clarification if required inputs, permissions, safety boundaries, or success criteria are missing.
