siliconflow-media

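SiliconFlow multimodal service: image generation (FLUX/Qwen), video generation (Wan), text-to-speech (TTS), and automatic speech recognition (ASR). Paid for with voucher credits.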

3,891 stars

by openclaw

Best use case

siliconflow-media is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using siliconflow-media should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/siliconflow-media/SKILL.md --create-dirs "https://raw.githubusercontent.com/openclaw/skills/main/skills/axdlee/siliconflow-media/SKILL.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/siliconflow-media/SKILL.md inside your project
Restart your AI agent — it will auto-discover the skill
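
To confirm the file landed where the agent will look for it, a small check along these lines may help. It is a minimal sketch: the two locations are the ones named in the install steps above (user-level `~/.claude/skills/...` and project-level `.claude/skills/...`), while the function names `skill_file_candidates` and `find_skill_files` are hypothetical helpers, not part of the skill.

```python
from pathlib import Path

# Relative location of the skill file, as given in the installation steps.
SKILL_RELATIVE_PATH = Path(".claude") / "skills" / "siliconflow-media" / "SKILL.md"


def skill_file_candidates(project_dir=".", home_dir=None):
    """Return the user-level and project-level paths where SKILL.md may live."""
    home = Path(home_dir) if home_dir is not None else Path.home()
    return [home / SKILL_RELATIVE_PATH, Path(project_dir) / SKILL_RELATIVE_PATH]


def find_skill_files(project_dir=".", home_dir=None):
    """Return only the candidate paths that exist as regular files."""
    return [p for p in skill_file_candidates(project_dir, home_dir) if p.is_file()]


if __name__ == "__main__":
    found = find_skill_files()
    if found:
        for path in found:
            print(f"found: {path}")
    else:
        print("SKILL.md not found in either location")
```

An empty result suggests the download or copy step did not complete, which is worth checking before restarting the agent.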

How siliconflow-media Compares

Feature / Agent	siliconflow-media	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

What does this skill do?

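It wraps the SiliconFlow multimodal service: image generation (FLUX/Qwen), video generation (Wan), text-to-speech (TTS), and automatic speech recognition (ASR). Usage is paid for with voucher credits.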

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# SiliconFlow Media Service

SiliconFlow offers a wide range of AI model services and supports payment by voucher (current balance: 3000+).

## Environment variables

- `SILICONFLOW_API_KEY` - SiliconFlow API key
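
Because every script depends on `SILICONFLOW_API_KEY`, it can be worth failing early with a clear message when the variable is missing. The following is a minimal sketch of such a pre-flight check; `require_api_key` is a hypothetical helper, and how the bundled scripts actually read the key is not shown in this document.

```python
import os


def require_api_key(env=None):
    """Return the SiliconFlow API key, or raise if it is missing or blank.

    `env` defaults to os.environ; passing a dict makes the check easy to test.
    """
    env = os.environ if env is None else env
    key = env.get("SILICONFLOW_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "SILICONFLOW_API_KEY is not set; export it before running the scripts"
        )
    return key


if __name__ == "__main__":
    try:
        require_api_key()
        print("SILICONFLOW_API_KEY is set")
    except RuntimeError as err:
        print(f"error: {err}")
```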

## 🎨 Image generation

```bash
uv run {baseDir}/scripts/image_gen.py --prompt "description" --filename "output.png" [--model MODEL]
```

**Available models**:

| Parameter | Model | Notes |
|------|------|------|
| `flux` (default) | FLUX.1-schnell | Fast, high quality; about 5-10 seconds |
| `flux-dev` | FLUX.1-dev | Dev version |
| `flux-pro` | FLUX.1-pro | Pro version |
| `qwen` | Qwen/Qwen-Image | Tongyi (Qwen) image generation |
| `qwen-edit` | Qwen/Qwen-Image-Edit | Image editing |
| `qwen-edit-2509` | Qwen/Qwen-Image-Edit-2509 | Latest editing version |

**Examples**:
```bash
# Fast generation with FLUX (the default model)
uv run {baseDir}/scripts/image_gen.py --prompt "a cute robot assistant" --filename "robot.png"

# Generation with Qwen
uv run {baseDir}/scripts/image_gen.py --prompt "a traditional Chinese landscape painting" --filename "landscape.png" --model qwen
```
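
When the image script is called from other tooling, building the argument list in code avoids shell-quoting problems with prompts and catches a mistyped `--model` value before any credits are spent. The sketch below assumes only the flags and model aliases listed above; `image_gen_command` and the `base_dir` parameter (standing in for `{baseDir}`) are hypothetical names.

```python
# Aliases accepted by --model, copied from the table above.
IMAGE_MODELS = {
    "flux": "FLUX.1-schnell",
    "flux-dev": "FLUX.1-dev",
    "flux-pro": "FLUX.1-pro",
    "qwen": "Qwen/Qwen-Image",
    "qwen-edit": "Qwen/Qwen-Image-Edit",
    "qwen-edit-2509": "Qwen/Qwen-Image-Edit-2509",
}


def image_gen_command(base_dir, prompt, filename, model=None):
    """Build the argv list for image_gen.py; omit --model to use the default."""
    if model is not None and model not in IMAGE_MODELS:
        known = ", ".join(sorted(IMAGE_MODELS))
        raise ValueError(f"unknown image model {model!r}; expected one of: {known}")
    cmd = [
        "uv", "run", f"{base_dir}/scripts/image_gen.py",
        "--prompt", prompt,
        "--filename", filename,
    ]
    if model is not None:
        cmd += ["--model", model]
    return cmd


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch has no side effects.
    print(image_gen_command("/path/to/skill", "a cute robot assistant", "robot.png"))
```

The returned list can be passed straight to `subprocess.run(cmd, check=True)`, which sidesteps shell interpretation of quotes or special characters in the prompt.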

## 🎬 Video generation

```bash
# Text-to-video
uv run {baseDir}/scripts/video_gen.py --prompt "description" --filename "output.mp4"

# Image-to-video
uv run {baseDir}/scripts/video_gen.py --prompt "description" --image "input.png" --filename "output.mp4"
```

**Models**:
- Text-to-video: `Wan-AI/Wan2.2-T2V-A14B`
- Image-to-video: `Wan-AI/Wan2.2-I2V-A14B`

⚠️ Video generation is slow (about 2-5 minutes).
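
Since a video run can take several minutes, a caller will usually want an explicit timeout rather than an indefinite wait. The sketch below builds the command for either mode and runs it with a generous limit. It assumes the script picks the T2V or I2V model based on whether `--image` is present, which the usage above implies but does not state; `video_gen_plan`, `run_video_gen`, and the 600-second default are illustrative choices.

```python
import subprocess

T2V_MODEL = "Wan-AI/Wan2.2-T2V-A14B"  # text-to-video
I2V_MODEL = "Wan-AI/Wan2.2-I2V-A14B"  # image-to-video


def video_gen_plan(base_dir, prompt, filename, image=None):
    """Return (argv, expected_model) for a text-to-video or image-to-video run."""
    cmd = ["uv", "run", f"{base_dir}/scripts/video_gen.py", "--prompt", prompt]
    if image is not None:
        cmd += ["--image", image]
    cmd += ["--filename", filename]
    return cmd, (I2V_MODEL if image is not None else T2V_MODEL)


def run_video_gen(cmd, timeout_s=600):
    """Run the command, raising subprocess.TimeoutExpired after timeout_s seconds.

    The default is twice the upper end of the documented 2-5 minute range.
    """
    return subprocess.run(
        cmd, check=True, capture_output=True, text=True, timeout=timeout_s
    )


if __name__ == "__main__":
    cmd, model = video_gen_plan("/path/to/skill", "waves at sunset", "waves.mp4")
    print(model)
    print(cmd)
```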

## 🎤 Text-to-speech (TTS)

```bash
uv run {baseDir}/scripts/tts.py --text "text to synthesize" --filename "output.mp3" [--model MODEL]
```

**Available models**:

| Parameter | Model | Notes |
|------|------|------|
| `fish-speech` (default) | fish-speech-1.5 | Fish Audio, high quality |
| `cosyvoice` | CosyVoice2-0.5B | Alibaba voice cloning |
| `indextts` | IndexTTS-2 | Index TTS |
| `moss` | MOSS-TTSD-v0.5 | MOSS, multilingual |

**Example**:
```bash
uv run {baseDir}/scripts/tts.py --text "Hello, world" --filename "hello.mp3"
```

## 👂 Speech recognition (ASR)

```bash
uv run {baseDir}/scripts/asr.py --audio "input.mp3" [--model MODEL]
```

**Available models**:

| Parameter | Model | Notes |
|------|------|------|
| `sensevoice` (default) | SenseVoiceSmall | Alibaba speech recognition |
| `teleai` | TeleSpeechASR | TeleAI recognition |

**Example**:
```bash
uv run {baseDir}/scripts/asr.py --audio "recording.mp3"
```
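
The TTS and ASR scripts pair naturally: synthesizing a sentence and transcribing the result is a quick end-to-end check of both. The sketch below builds the two commands using only the documented flags and model aliases. It assumes `tts.py` writes to exactly the path given in `--filename`; `round_trip_commands` and `base_dir` (standing in for `{baseDir}`) are hypothetical names.

```python
# Aliases accepted by --model, copied from the TTS and ASR model tables.
TTS_MODELS = {
    "fish-speech": "fish-speech-1.5",
    "cosyvoice": "CosyVoice2-0.5B",
    "indextts": "IndexTTS-2",
    "moss": "MOSS-TTSD-v0.5",
}
ASR_MODELS = {
    "sensevoice": "SenseVoiceSmall",
    "teleai": "TeleSpeechASR",
}


def round_trip_commands(base_dir, text, audio_file,
                        tts_model="fish-speech", asr_model="sensevoice"):
    """Return (tts_argv, asr_argv): synthesize `text`, then transcribe the file."""
    if tts_model not in TTS_MODELS:
        raise ValueError(f"unknown TTS model: {tts_model!r}")
    if asr_model not in ASR_MODELS:
        raise ValueError(f"unknown ASR model: {asr_model!r}")
    tts_cmd = [
        "uv", "run", f"{base_dir}/scripts/tts.py",
        "--text", text, "--filename", audio_file, "--model", tts_model,
    ]
    asr_cmd = [
        "uv", "run", f"{base_dir}/scripts/asr.py",
        "--audio", audio_file, "--model", asr_model,
    ]
    return tts_cmd, asr_cmd


if __name__ == "__main__":
    for cmd in round_trip_commands("/path/to/skill", "Hello, world", "hello.mp3"):
        print(cmd)
```

Running the first command and then the second should yield a transcript close to the input text; a large mismatch points at the audio file or the chosen models.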

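## Notes

1. ✅ Costs are deducted from the voucher balance; no additional payment is needed.
2. ⏱️ Image generation takes about 5-10 seconds.
3. ⏱️ Video generation takes about 2-5 minutes (be patient).
4. 📝 Every script prints a `MEDIA:` line, which is used to attach the output file automatically.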
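
A caller that wants to pick up the generated files itself can scan the script output for those `MEDIA:` lines. The exact line format is not documented here, so the sketch below assumes the prefix is followed by a file path on the same line; `extract_media_paths` is a hypothetical helper and the sample output is made up for illustration.

```python
def extract_media_paths(stdout):
    """Collect the values of all `MEDIA:` lines in a script's standard output.

    Assumes each such line has the form `MEDIA: <path>`; blank values are skipped.
    """
    prefix = "MEDIA:"
    paths = []
    for line in stdout.splitlines():
        line = line.strip()
        if line.startswith(prefix):
            value = line[len(prefix):].strip()
            if value:
                paths.append(value)
    return paths


if __name__ == "__main__":
    # Illustrative output only; real script output may differ.
    sample = "Generating image...\nMEDIA: robot.png\nDone\n"
    print(extract_media_paths(sample))
```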
