gemini-system
Gemini CLI specialized for multimodal file processing only. MUST use when PDF, video, audio, or image files need content extraction. Auto-triggers: file extensions .pdf, .mp4, .mov, .mp3, .wav, .m4a. For research/codebase analysis → use general-purpose subagent (Opus) instead. Planning/design → use Codex instead.
Best use case
gemini-system is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Teams using gemini-system should expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in .claude/skills/gemini-system/SKILL.md inside your project
- Restart your AI agent; it will auto-discover the skill
How gemini-system Compares
| Feature / Agent | gemini-system | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Gemini CLI specialized for multimodal file processing only. MUST use when PDF, video, audio, or image files need content extraction. Auto-triggers: file extensions .pdf, .mp4, .mov, .mp3, .wav, .m4a. For research/codebase analysis → use general-purpose subagent (Opus) instead. Planning/design → use Codex instead.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# Gemini System — Multimodal File Processing
**Gemini CLI is specialized for multimodal file processing (PDF, video, audio, image).**
> **Detailed rules**: `.claude/rules/gemini-delegation.md`
> **Research/codebase analysis**: Use general-purpose subagent (Opus) instead — Opus now supports 1M context.
## Multimodal File Processing
Extract content from PDF, video, audio, and image files.
```bash
# PDF
gemini -p "Extract: {what to extract} @/path/to/file.pdf" 2>/dev/null
# Video
gemini -p "Summarize: key concepts, timestamps @/path/to/video.mp4" 2>/dev/null
# Audio
gemini -p "Transcribe and summarize: decisions, action items @/path/to/audio.mp3" 2>/dev/null
# Image (diagrams, charts)
gemini -p "Analyze: components, relationships, data flow @/path/to/diagram.png" 2>/dev/null
```
| Target | Extensions |
|--------|------------|
| PDF | `.pdf` |
| Video | `.mp4`, `.mov`, `.avi`, `.mkv`, `.webm` |
| Audio | `.mp3`, `.wav`, `.m4a`, `.flac`, `.ogg` |
| Images (advanced analysis) | `.png`, `.jpg`, `.jpeg`, `.gif`, `.webp`, `.svg` |
> Simple screenshot inspection can be done directly with Claude's Read tool.
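
The extension table above can be read as a simple lookup. As a minimal sketch, assuming a hypothetical helper named `media_kind` (not part of the skill itself), a `case` statement can map a file name to its target category:

```bash
#!/usr/bin/env bash
# media_kind: hypothetical helper (an illustration, not part of the skill)
# that maps a file name to a target category from the extension table.
# Prints one of: pdf, video, audio, image, none.
media_kind() {
  local name ext
  name=${1##*/}                 # drop any directory part
  case "$name" in
    *.*) ext=${name##*.} ;;     # text after the last dot
    *)   ext="" ;;              # no dot, so no extension
  esac
  # Lowercase the extension so Talk.MP4 is treated like talk.mp4.
  ext=$(printf '%s' "$ext" | tr '[:upper:]' '[:lower:]')
  case "$ext" in
    pdf)                        echo pdf ;;
    mp4|mov|avi|mkv|webm)       echo video ;;
    mp3|wav|m4a|flac|ogg)       echo audio ;;
    png|jpg|jpeg|gif|webp|svg)  echo image ;;
    *)                          echo none ;;
  esac
}

media_kind "/path/to/Talk.MP4"   # prints: video
media_kind "notes.txt"           # prints: none
```

A result of `image` only means the file is a candidate for advanced analysis; per the note above, a simple screenshot may not need Gemini at all.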
## Auto-Trigger
When multimodal files appear in a task, automatically pass them to Gemini without waiting for user instructions.
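
One way to picture the auto-trigger rule is as a filter over the file paths mentioned in a task. The sketch below is an assumption about how such a filter might look, not the skill's actual mechanism: `auto_trigger_files` is a hypothetical name, and the pattern covers only the six extensions listed as auto-triggers in the skill description. The loop is a dry run that prints the command rather than executing it.

```bash
#!/usr/bin/env bash
# auto_trigger_files: hypothetical filter (illustration only) that prints,
# one per line, the arguments whose extension is an auto-trigger extension.
auto_trigger_files() {
  local arg lower
  for arg in "$@"; do
    # Compare case-insensitively, but print the path exactly as given.
    lower=$(printf '%s' "$arg" | tr '[:upper:]' '[:lower:]')
    case "$lower" in
      *.pdf|*.mp4|*.mov|*.mp3|*.wav|*.m4a) printf '%s\n' "$arg" ;;
    esac
  done
}

# Dry run: show the command that would be issued for each matching file.
auto_trigger_files "docs/spec.pdf" "src/main.py" "standup.m4a" |
while IFS= read -r file; do
  printf 'gemini -p "Extract: {what to extract} @%s"\n' "$file"
done
```

Widening the `case` pattern to the other extensions in the table is a one-line change, if that is the intended trigger set.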
## When NOT to Use Gemini
| Task | Correct Owner |
|------|---------------|
| Research and investigation | **general-purpose subagent** (Opus) |
| Codebase analysis | **general-purpose subagent** (Opus) |
| Design and planning | **Codex** |
| Debugging | **Codex** |
| Code implementation | **Claude / Subagents** |
## How to Use
### Subagent Pattern (for large outputs)
```
Task tool parameters:
- subagent_type: "gemini-explore"
- prompt: |
    {task description}
    gemini -p "{prompt} @/path/to/file" 2>/dev/null
    Return CONCISE summary (5-7 bullet points).
```
### Direct Call (for short extractions)
```bash
gemini -p "{what to extract} @/path/to/file" 2>/dev/null
```
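
Because the direct call discards stderr, a missing or unreadable file can fail silently. A minimal sketch of a guard, assuming a hypothetical wrapper named `gemini_extract` and an assumed `GEMINI_BIN` override (neither is defined by the skill), checks the file before making the same call:

```bash
#!/usr/bin/env bash
# gemini_extract: hypothetical wrapper (illustration only) around the direct call.
# Usage: gemini_extract "<what to extract>" /path/to/file
# GEMINI_BIN is an assumed override for testing; it defaults to gemini.
gemini_extract() {
  local prompt=$1 file=$2
  if [ -z "$prompt" ] || [ -z "$file" ]; then
    echo "usage: gemini_extract PROMPT FILE" >&2
    return 2
  fi
  if [ ! -r "$file" ]; then
    echo "gemini_extract: cannot read $file" >&2
    return 1
  fi
  # Same shape as the direct call: prompt text, then @path, stderr discarded.
  "${GEMINI_BIN:-gemini}" -p "$prompt @$file" 2>/dev/null
}

# Example (requires the gemini CLI to be installed):
#   gemini_extract "Extract: section headings" /path/to/file.pdf
```

The function returns the exit status of the underlying command, so a caller can still tell a failed extraction from an empty result.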
## Language Protocol
1. Ask Gemini in **English**
2. Receive response in **English**
3. Report to user in **the user's language**
Related Skills
codex-system
Codex CLI handles planning, design, and complex code implementation. Use for: architecture design, implementation planning, complex algorithms, debugging (root cause analysis), trade-off evaluation, code review. External research is NOT Codex's job — use general-purpose subagent (Opus) instead. Explicit triggers: "plan", "design", "architecture", "think deeper", "analyze", "debug", "complex", "optimize".
context-loader
ALWAYS activate this skill at the start of every task. Load project context from .claude/ directory including coding rules, design decisions, and documentation before executing any task.
update-lib-docs
Update library documentation in .claude/docs/libraries/ with latest information from web search.
update-design
Explicitly update DESIGN.md with decisions from the current conversation. Use when you want to force a design document update.
troubleshoot
Diagnose and plan fixes for errors/bugs with Codex-first multi-agent collaboration (Codex + Opus 4.6 + Agent Teams). Codex CLI is consulted in EVERY phase for deep code reasoning, hypothesis evaluation, and fix validation. Phase 1: Error reproduction & context gathering (Opus subagent 1M context + Codex initial analysis + Claude user interaction). Phase 2: Parallel diagnosis (Agent Teams: Root Cause Analyst [Codex-driven] + Impact Investigator [Opus + Codex risk analysis]). Phase 3: Fix plan synthesis, Codex validation & user approval. Fix implementation is handled separately by /team-implement.
team-review
Parallel code review using Agent Teams. Spawns specialized reviewers (security, quality, test coverage) to review implementation from different perspectives simultaneously. Run after implementation.
team-implement
Parallel implementation using Agent Teams. Spawns teammates per module/layer, each owning separate files to avoid conflicts. Uses shared task list with dependencies for autonomous coordination. Run after /start-feature plan approval.
tdd
Implement features using Test-Driven Development (TDD) with Red-Green-Refactor cycle.
start-feature
Start a new feature with multi-agent collaboration (Opus 4.6 + Agent Teams). Phase 1: Codebase understanding (Opus subagent 1M context + Claude user interaction). Phase 2: Parallel research & design (Agent Teams: Researcher + Architect). Phase 3: Plan synthesis & user approval. Implementation is handled separately by /team-implement.
spike
Time-boxed technical investigation/feasibility study with Codex-first multi-agent collaboration (Codex + Opus 4.6 + Agent Teams). Codex CLI is consulted in EVERY phase for question framing, feasibility analysis, and final evaluation. Phase 1: Frame the investigation question & constraints (Claude user interaction + Codex question decomposition). Phase 2: Parallel investigation (Agent Teams: Researcher [Opus external research] + Feasibility Analyst [Codex deep analysis] + optional prototype). Phase 3: Codex synthesis into go/no-go recommendation & research report. Produces a DECISION DOCUMENT, NOT an implementation plan. Use /add-feature or /start-feature after a GO decision.
simplify
Simplify and refactor code while preserving functionality and library constraints.
research-lib
Research a library and create comprehensive documentation in .claude/docs/libraries/.