cliproxy-media
Analyze images (jpg, png, gif, webp) and PDFs via CLIProxyAPI — a Claude Max proxy that routes requests through your subscription at zero extra cost. Use this skill whenever you need to analyze, describe, or extract information from an image or photo ("analyze image", "describe photo", "what is in this picture"), read or summarize a PDF document ("read PDF", "summary of this document"), or process any media file via a CLIProxy-compatible endpoint ("process media via proxy", "cliproxy vision", "cliproxy media"). NEVER use the built-in `image` or `pdf` tools when using CLIProxyAPI — they fall back to direct Anthropic API which requires separate credits. Use this skill instead for all vision and document analysis tasks.
Best use case
cliproxy-media is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Analyze images (jpg, png, gif, webp) and PDFs via CLIProxyAPI — a Claude Max proxy that routes requests through your subscription at zero extra cost. Use this skill whenever you need to analyze, describe, or extract information from an image or photo ("analyze image", "describe photo", "what is in this picture"), read or summarize a PDF document ("read PDF", "summary of this document"), or process any media file via a CLIProxy-compatible endpoint ("process media via proxy", "cliproxy vision", "cliproxy media"). NEVER use the built-in `image` or `pdf` tools when using CLIProxyAPI — they fall back to direct Anthropic API which requires separate credits. Use this skill instead for all vision and document analysis tasks.
Teams using cliproxy-media should expect a more consistent output, faster repeated execution, less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in
.claude/skills/cliproxy-media/SKILL.mdinside your project - Restart your AI agent — it will auto-discover the skill
How cliproxy-media Compares
| Feature / Agent | cliproxy-media | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Analyze images (jpg, png, gif, webp) and PDFs via CLIProxyAPI — a Claude Max proxy that routes requests through your subscription at zero extra cost. Use this skill whenever you need to analyze, describe, or extract information from an image or photo ("analyze image", "describe photo", "what is in this picture"), read or summarize a PDF document ("read PDF", "summary of this document"), or process any media file via a CLIProxy-compatible endpoint ("process media via proxy", "cliproxy vision", "cliproxy media"). NEVER use the built-in `image` or `pdf` tools when using CLIProxyAPI — they fall back to direct Anthropic API which requires separate credits. Use this skill instead for all vision and document analysis tasks.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
Related Guides
Best AI Skills for ChatGPT
Find the best AI skills to adapt into ChatGPT workflows for research, writing, summarization, planning, and repeatable assistant tasks.
Top AI Agents for Productivity
See the top AI agent skills for productivity, workflow automation, operational systems, documentation, and everyday task execution.
Best AI Skills for Claude
Explore the best AI skills for Claude and Claude Code across coding, research, workflow automation, documentation, and agent operations.
SKILL.md Source
# cliproxy-media
**Source:** https://github.com/bencoremans/site/tree/main/skills/cliproxy-media
Analyze images and PDFs via CLIProxyAPI (Claude Max subscription, zero extra cost).
## Setup
Set the endpoint to your CLIProxy instance:
```bash
export CLIPROXY_URL=http://your-host:8317/v1/messages
```
For Docker setups, replace `your-host` with your container hostname (e.g. `cliproxyapi`, `localhost`, or the container IP).
## Quick start
```bash
# Analyze an image
python3 skills/cliproxy-media/scripts/analyze.py /path/to/image.jpg "What is in this image?"
# Read a PDF
python3 skills/cliproxy-media/scripts/analyze.py /path/to/document.pdf "Give a summary"
# Compare multiple images
python3 skills/cliproxy-media/scripts/analyze.py img1.jpg img2.jpg "Compare these images"
# With streaming (output appears immediately)
python3 skills/cliproxy-media/scripts/analyze.py --stream image.jpg "Describe in detail"
# With system prompt
python3 skills/cliproxy-media/scripts/analyze.py --system "You are a medical expert" scan.jpg "What do you see?"
# With higher token limit
python3 skills/cliproxy-media/scripts/analyze.py --max-tokens 4096 document.pdf "Extensive analysis"
```
## What works ✅ / What doesn't ❌
### ✅ Supported file types
| Type | Format | Note |
|------|--------|------|
| Image | `.jpg` / `.jpeg` | Requires valid JPEG data |
| Image | `.png` | Fully supported |
| Image | `.gif` | Fully supported |
| Image | `.webp` | Fully supported |
| Document | `.pdf` | Base64-encoded, via `document` content type |
| Image via URL | `http://` / `https://` | Direct URL reference, no download needed |
**Multiple files at once:** Provide multiple paths before the question. Max ~100 per request (Anthropic limit).
### ❌ Not supported
- **Office files** (`.docx`, `.xlsx`, `.pptx`) — Workaround: convert to PDF
- **Audio** (`.mp3`, `.wav`, `.ogg`) — Use Whisper for transcription
- **Video** (`.mp4`, `.mov`, `.avi`) — Not supported by the model
- **Other document types** (`.txt`, `.html`, `.md` as document) — Send text directly as a string
## ⚠️ System prompt warning
CLIProxyAPI accepts **only** the array notation for system prompts. The string notation is **silently ignored** — the model does not see it, but you also won't get an error message!
```python
# ❌ DOES NOT WORK — ignored without error message
payload["system"] = "You are an expert."
# ✅ WORKS — always use array notation
payload["system"] = [{"type": "text", "text": "You are an expert."}]
```
The `--system` argument in `analyze.py` automatically uses the correct array notation.
## Configuration (env vars)
| Variable | Default | Description |
|----------|---------|-------------|
| `CLIPROXY_URL` | `http://localhost:8317/v1/messages` | Full endpoint URL |
| `CLIPROXY_MODEL` | `claude-sonnet-4-6` | Model to use |
Example:
```bash
export CLIPROXY_URL=http://localhost:8317/v1/messages
export CLIPROXY_MODEL=claude-opus-4-6
python3 skills/cliproxy-media/scripts/analyze.py image.jpg "question"
```
## Additional options
```
--stream Streaming output via SSE (output appears immediately)
--system TEXT System prompt (automatically sent as array)
--max-tokens N Maximum output tokens (default: 1024)
--model MODEL Model override (overrides CLIPROXY_MODEL)
--url URL Endpoint override (overrides CLIPROXY_URL)
```
## Compatibility
This script works with any API that supports the Anthropic Messages format:
| Provider | Compatible | Note |
|----------|-----------|------|
| **CLIProxyAPI** | ✅ Yes | Primarily tested, system prompt array required |
| **OpenRouter** | ✅ Yes | Use Bearer token instead of `x-api-key: dummy` |
| **LiteLLM** | ✅ Yes | As proxy for Anthropic format |
| **Anthropic direct** | ✅ Yes | Use `ANTHROPIC_API_KEY` as x-api-key |
**Note for non-CLIProxy endpoints:** Some proxies do accept string notation for system prompts. Always use array notation for maximum compatibility.
## Known limitations of CLIProxyAPI
- `temperature` and `top_p` may **not** be used at the same time (HTTP 400)
- PDF as document with URL source does not work (`Unable to download the file`)
- Only `claude-sonnet-4-6` and `claude-opus-4-6` available (haiku is deprecated)
- `inference_geo` is always `not_available` in the response
## Direct Python API
If you want to call the script from your own Python code:
```python
import subprocess, json
result = subprocess.run(
["python3", "skills/cliproxy-media/scripts/analyze.py", "image.jpg", "Describe this"],
capture_output=True, text=True
)
print(result.stdout)
```
Or use the built-in exec tool:
```
exec: python3 skills/cliproxy-media/scripts/analyze.py /path/to/image.jpg "question"
```Related Skills
openclaw-media-gen
Generate images & videos with AIsa. Gemini 3 Pro Image (image) + Qwen Wan 2.6 (video) via one API key.
media-compress
Compress and convert images and videos using ffmpeg. Use when the user wants to reduce file size, change format, resize, or optimize media files. Handles common formats like JPG, PNG, WebP, MP4, MOV, WebM. Triggers on phrases like "compress image", "compress video", "reduce file size", "convert to webp/mp4", "resize image", "make image smaller", "batch compress", "optimize media".
cpa-codex-auth-sweep-cliproxy
通过 CLI Proxy Management API 拉取 Codex 认证文件并高并发探活扫描。适用于「扫号」「清死号」「清理 Codex 401」场景;仅在用户明确确认后可删除 401。执行前必须提供 base_url 与 management_key。安全限制:默认仅允许 https://chatgpt.com 作为 probe 主机,非白名单目标需显式危险确认。
social-media-agent
Automated social media manager — plan, write, schedule, and analyze content across X/Twitter, LinkedIn, Instagram, TikTok, Facebook, and Pinterest. Integrates with Buffer (free) or Postiz (self-hosted) for scheduling.
social-media-content-scraper-pro
Social Media Content Bulk Scraper, extract articles/posts from WeChat, Instagram, TikTok, YouTube, export to Markdown/HTML with full metadata. $0.005 USDT per use.
cliproxy-openclaw
Deploy and configure CLIProxyAPI, expose its dashboard safely, connect OAuth providers like Claude Code, Gemini, Codex, Qwen, and iFlow, generate a reusable API endpoint and API key, and integrate it with OpenClaw or other OpenAI-compatible tools. Use when the user wants one API layer from subscription-based CLI or OAuth accounts, multi-account routing, or CLIProxy setup on a VPS or local machine.
siliconflow-media
SiliconFlow 多模态服务,支持图片生成(FLUX/Qwen)、视频生成(Wan)、TTS语音合成、ASR语音识别。使用代金券支付。
Macrocosmos SN13 API - Social Media Data Skill
Fetch real-time social media data from X (Twitter) and Reddit by keyword, username, date range, and filters with engagement metrics via Macrocosmos SN13 API on Bittensor.
muapi-media-generation
Generate AI images, videos, music, and audio from the terminal via muapi.ai — supports 100+ models including Flux, Midjourney v7, Kling 3.0, Veo3, and Suno V5
muapi-media-editing
Edit and enhance images and videos with AI via muapi.ai — prompt-based editing, upscaling, background removal, face swap, lipsync, video effects, and more
media-writing
You are a professional media writing expert with extensive experience in creating engaging and impactful content across multiple formats. Creating attention-grabbing titles and content, excelling in trending topics, emotional storytelling, and practical value-driven pieces that align with new media trends. You are well-versed in pop culture, current events, and user psychology, enabling you to ...
social-media-analyzer
Social media campaign analysis and performance tracking. Calculates engagement rates, ROI, and benchmarks across platforms. Use for analyzing social media performance, calculating engagement rate, measuring campaign ROI, comparing platform metrics, or benchmarking against industry standards.