multi-ai-research
Comprehensive research and analysis using Claude (subagents), Gemini CLI, and Codex CLI. Multi-perspective research with cross-verification, iterative refinement, and 100% citation coverage. Use for security analysis, architecture research, code quality assessment, performance analysis, or any research requiring rigorous verification and multiple AI perspectives.
Best use case
multi-ai-research is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Comprehensive research and analysis using Claude (subagents), Gemini CLI, and Codex CLI. Multi-perspective research with cross-verification, iterative refinement, and 100% citation coverage. Use for security analysis, architecture research, code quality assessment, performance analysis, or any research requiring rigorous verification and multiple AI perspectives.
Teams using multi-ai-research should expect a more consistent output, faster repeated execution, less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in
.claude/skills/multi-ai-research-adaptationio/SKILL.mdinside your project - Restart your AI agent — it will auto-discover the skill
How multi-ai-research Compares
| Feature / Agent | multi-ai-research | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Comprehensive research and analysis using Claude (subagents), Gemini CLI, and Codex CLI. Multi-perspective research with cross-verification, iterative refinement, and 100% citation coverage. Use for security analysis, architecture research, code quality assessment, performance analysis, or any research requiring rigorous verification and multiple AI perspectives.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
Related Guides
Best AI Skills for Claude
Explore the best AI skills for Claude and Claude Code across coding, research, workflow automation, documentation, and agent operations.
ChatGPT vs Claude for Agent Skills
Compare ChatGPT and Claude for AI agent skills across coding, writing, research, and reusable workflow execution.
SKILL.md Source
# Multi-AI Research & Analysis
## Overview
Harnesses three AI systems (Claude via Task tool, Gemini CLI, Codex CLI) for comprehensive research and analysis with multi-perspective verification and iterative refinement.
**Purpose**: Produce analysis more thorough than any single AI could achieve through specialized roles, cross-validation, and systematic verification.
**Key Innovation**: Not just parallel execution - specialized research roles with cross-verification and iterative refinement until production-ready (quality ≥95/100, 100% citations, zero gaps).
**The 3 AI Systems**:
1. **Claude Subagents** (via Task tool) - Documentation, codebase analysis, synthesis
2. **Gemini CLI** - Web research, latest trends, community practices
3. **Codex CLI** - GitHub patterns, code examples, deep reasoning
**Quality Guarantees**:
- ✓ **100% coverage** - All objectives addressed, zero gaps
- ✓ **100% citations** - Every claim sourced (file:line or URL)
- ✓ **Multi-perspective** - 3 AI systems cross-validated
- ✓ **≥95/100 quality** - Verified through 3-pass system
- ✓ **Actionable** - Specific recommendations with examples
- ✓ **Resumable** - External memory enables multi-session work
---
## When to Use
Use this skill for:
### Security Analysis
- Authentication/authorization assessment
- Vulnerability identification
- Best practice validation
- OWASP Top 10 coverage
- Penetration testing preparation
### Architecture Analysis
- System design review
- Component mapping
- Integration pattern analysis
- Scalability assessment
- Technical debt evaluation
### Code Quality Analysis
- Pattern detection
- Code smell identification
- Complexity metrics
- Refactoring opportunities
- Best practice adherence
### Performance Analysis
- Bottleneck identification
- Algorithm complexity
- Resource usage patterns
- Optimization opportunities
- Benchmark analysis
### Research Synthesis
- Multi-source research compilation
- Best practice identification
- Technology evaluation
- Pattern discovery
- Trend analysis
### Comprehensive Reviews
- Pre-production audit
- System health check
- Compliance verification
- Documentation audit
- Knowledge transfer
---
## Quick Start
### Option 1: Automated Script
```bash
# Run complete analysis automatically
bash .claude/skills/multi-ai-research/scripts/analyze.sh "Security analysis of authentication system"
```
This will:
1. Create analysis plan
2. Launch parallel research (Claude + Gemini + Codex)
3. Perform deep analysis
4. Synthesize and verify
5. Iterate if needed
6. Generate final report
### Option 2: Interactive Mode
Ask Claude Code to use this skill:
```
"Use multi-ai-research to analyze [objective]"
```
Claude will:
1. Create comprehensive analysis plan
2. Coordinate all three AI systems
3. Synthesize findings
4. Verify quality
5. Iterate until ≥95 quality
6. Deliver final report
---
## The 5-Phase Pipeline
### Phase 1: Planning & Strategy
**Duration**: 5-10 minutes
**Output**: `.analysis/ANALYSIS_PLAN.md`
Claude creates comprehensive plan:
- Defines objectives and scope
- Plans file reading strategy (glob → grep → read)
- Assigns tasks to AI systems
- Sets verification criteria
- Defines success thresholds
### Phase 2: Parallel Research
**Duration**: 10-20 minutes
**Output**: `.analysis/research/*.md`
All three systems research simultaneously:
**Claude Subagent**:
- Official documentation analysis
- Codebase examination (progressive disclosure)
- Architecture mapping
- Pattern identification
**Gemini CLI**:
- Web research (latest 2024-2025)
- Community best practices
- Industry trends
- Common pitfalls
**Codex CLI**:
- GitHub pattern analysis
- Code examples from top repos
- Implementation references
- Testing strategies
### Phase 3: Deep Analysis
**Duration**: 15-30 minutes
**Output**: `.analysis/analysis/code-patterns.md`
Claude Analysis Agent with extended thinking:
- Progressive codebase analysis
- Pattern recognition across sources
- Architecture mapping
- Metrics calculation
- Risk assessment
### Phase 4: Synthesis & Verification
**Duration**: 10-20 minutes
**Outputs**:
- `.analysis/SYNTHESIS_REPORT.md`
- `.analysis/verification/cross-check.md`
**Synthesis** (Claude with extended thinking):
- Read all research findings
- Identify themes across sources
- Resolve contradictions
- Create unified narrative
- Full citations
**Verification** (Verification Subagent):
- 3-pass verification (completeness, accuracy, quality)
- Cross-source validation
- Citation checking
- Gap analysis
- Quality scoring
### Phase 5: Iteration (if needed)
**Duration**: 10-30 minutes
**Output**: `.analysis/iterations/ITERATION_2.md`
If quality <95 or gaps exist:
- Targeted research for gaps
- Quality improvements
- Re-verification
- Repeat until ≥95
### Phase 6: Final Report
**Duration**: 5-10 minutes
**Output**: `.analysis/ANALYSIS_FINAL.md`
Comprehensive final report:
- Executive summary
- Complete findings
- All sources synthesized
- Prioritized recommendations
- Implementation guidance
- Full citations
**Total Time**: 45-90 minutes for comprehensive analysis
---
## Analysis Types
### Security Analysis
**What it checks**:
- Authentication/authorization patterns
- Input validation
- Secret management
- Injection vulnerabilities (SQL, XSS, etc.)
- Dependency vulnerabilities
- Rate limiting
- Session security
**Example**:
```
Use multi-ai-research for "Security audit of authentication system"
```
**Output**:
- Critical/High/Medium/Low priority issues
- OWASP Top 10 coverage
- Code examples with file:line
- Specific remediation steps
- Industry best practices comparison
### Architecture Analysis
**What it examines**:
- System components and boundaries
- Integration patterns
- Data flow
- Dependency relationships
- Scalability considerations
- Design patterns used
**Example**:
```
Use multi-ai-research for "Architecture analysis of microservices system"
```
**Output**:
- Component map with relationships
- Integration pattern analysis
- Scalability assessment
- Technical debt identification
- Refactoring recommendations
### Code Quality Analysis
**What it analyzes**:
- Code patterns and organization
- Complexity metrics
- Code smells
- Best practice adherence
- Test coverage
- Documentation quality
**Example**:
```
Use multi-ai-research for "Code quality assessment for ./src"
```
**Output**:
- Quality score with breakdown
- Pattern analysis
- Refactoring priorities
- Specific code improvements
- Complexity hotspots
### Performance Analysis
**What it identifies**:
- Algorithm complexity
- Bottlenecks
- Resource usage patterns
- Database query efficiency
- Network call patterns
**Example**:
```
Use multi-ai-research for "Performance bottleneck identification"
```
**Output**:
- Bottleneck analysis with file:line
- Optimization opportunities
- Before/after estimations
- Implementation guidance
### Research Synthesis
**What it compiles**:
- Official documentation
- Web best practices
- GitHub patterns
- Industry standards
- Community insights
**Example**:
```
Use multi-ai-research for "Research GraphQL federation patterns 2024-2025"
```
**Output**:
- Multi-source synthesis
- Consensus findings (all sources agree)
- Multiple perspectives (sources differ)
- Code examples
- Implementation recommendations
---
## How It Works
### Progressive Disclosure
**Never reads files blindly**. Always uses 3-level approach:
**Level 1: Metadata (glob)** - ~50 tokens
```bash
glob "**/*.{ts,js,py}" # Understand structure
glob "**/*.md" # Find documentation
glob "**/package.json" # Check dependencies
```
**Level 2: Patterns (grep)** - ~5k tokens
```bash
grep "export class|interface" --glob "**/*.ts"
grep "TODO|FIXME|BUG" --glob "**/*"
grep "password|secret|token" --glob "**/*.ts"
```
**Level 3: Reading (read)** - ~50k tokens
```bash
read "src/auth/login.ts" # Only critical files
read "docs/architecture.md"
```
**Result**: 90%+ reduction in unnecessary file reads
### External Memory Architecture
All state saved to files, not context:
```
.analysis/
├── ANALYSIS_PLAN.md # Strategy and assignments
├── research/
│ ├── claude-docs.md # Claude research
│ ├── gemini-web.md # Gemini research
│ └── codex-github.md # Codex research
├── analysis/
│ ├── code-patterns.md # Pattern analysis
│ └── architecture-map.md # System map
├── verification/
│ └── cross-check.md # Verification results
├── iterations/
│ ├── ITERATION_1.md # First pass
│ └── ITERATION_2.md # Gap fills
└── ANALYSIS_FINAL.md # Complete report
```
**Benefits**:
- Survives context window limits
- Enables multi-session analysis
- Resumable from any checkpoint
- No information loss
### Cross-Validation Pattern
**High Confidence** (★★★★★): All 3 sources agree + code verification
**Medium Confidence** (★★★☆☆): 2/3 sources agree
**Requires Investigation** (★★☆☆☆): Sources conflict
**Example**:
```markdown
## JWT Implementation (High Confidence ★★★★★)
**Claude**: "Uses JWT with HS256" (src/auth/jwt.ts:15)
**Gemini**: "HS256 is industry standard 2024" (URL)
**Codex**: "150+ repos use HS256 pattern" (GitHub)
**Code**: Verified at src/auth/jwt.ts:18-22
**Recommendation**: Implementation correct per standards
```
### Quality Scoring
**Comprehensive rubric** (0-100):
- **Comprehensiveness** (/20): All aspects covered
- **Accuracy** (/20): All claims sourced and verified
- **Specificity** (/20): File:line precision, not vague
- **Actionability** (/20): Specific recommendations
- **Consistency** (/20): No contradictions
**Quality Gates**:
- ≥95: Production-ready
- 85-94: Needs minor refinement
- 75-84: Needs iteration
- <75: Requires rework
### Iterative Refinement
**Iteration 1** (Breadth): Broad coverage, identifies gaps
**Iteration 2** (Depth): Fill gaps, improve quality
**Iteration 3** (Polish): Final verification, perfection
**Automatic iteration** until:
- Quality ≥95
- Citation coverage = 100%
- Critical gaps = 0
---
## AI System Roles
### Claude Subagents (via Task tool)
**Research Agent** (Haiku):
- Progressive disclosure expert
- Documentation analysis
- Codebase examination
- Pattern detection
**Analysis Agent** (Sonnet):
- Extended thinking for synthesis
- Multi-source integration
- Pattern recognition
- Architectural insights
**Verification Agent** (Haiku):
- 3-pass verification
- Citation checking
- Gap analysis
- Quality scoring
### Gemini CLI
**Strengths**:
- Native web search
- Latest trends (2024-2025)
- Community practices
- Multimodal analysis (if needed)
**Use for**:
- Best practice research
- Industry standards
- Latest vulnerabilities
- Framework comparisons
### Codex CLI
**Strengths**:
- GitHub integration
- Code pattern search
- Deep reasoning (o3 model)
- Implementation examples
**Use for**:
- Code examples
- Design patterns
- Architecture reasoning
- Testing strategies
---
## Configuration
### Prerequisites
**Required**:
- Claude Code (with Task tool access)
**Optional but Recommended**:
- Gemini CLI: `npm install -g @google/gemini-cli`
- Codex CLI: `npm install -g @openai/codex`
**Note**: Skill works with Claude-only fallback if Gemini/Codex unavailable.
### Gemini CLI Setup
```bash
# Install
npm install -g @google/gemini-cli
# Authenticate (OAuth - free)
gemini
# Follow browser authentication
# Test
gemini -p "test prompt"
```
### Codex CLI Setup
```bash
# Install
npm install -g @openai/codex
# Authenticate (ChatGPT Plus/Pro account)
codex login
# Follow browser authentication
# Test
codex exec "test prompt"
```
### Model Selection
**Claude**:
- Haiku: Research & verification (fast, efficient)
- Sonnet: Analysis & synthesis (balanced)
- Opus: Complex reasoning (if needed)
**Gemini**:
- gemini-2.5-flash: Quick research
- gemini-2.5-pro: Complex analysis
**Codex**:
- gpt-5.1-codex: Standard tasks
- o3: Deep architectural reasoning
- o4-mini: Quick operations
---
## Examples
### Example 1: Security Analysis
```
Objective: "Security audit of authentication system"
Phase 2 - Parallel Research:
├─ Claude: Analyzes src/auth/* for patterns
├─ Gemini: Researches "OAuth 2.0 security best practices 2024"
└─ Codex: Finds GitHub examples of secure auth
Phase 3 - Analysis:
└─ Claude: Identifies 3 critical, 5 high priority issues
Phase 4 - Synthesis:
└─ All agree: Missing rate limiting (CRITICAL)
- Claude: No rate limit found in src/auth/login.ts
- Gemini: OWASP recommends max 5 attempts/hour
- Codex: 150+ repos use express-rate-limit
- Recommendation: Implement with Redis backend
Final Report:
├─ Executive summary
├─ 8 issues (3 critical, 5 high) with fixes
├─ OWASP Top 10 coverage
├─ Specific code examples
└─ Priority implementation plan
Quality: 97/100 ✓
```
### Example 2: Architecture Analysis
```
Objective: "Analyze microservices architecture"
Phase 2:
├─ Claude: Maps services via glob + grep
├─ Gemini: Researches microservices patterns 2024
└─ Codex: Finds service mesh examples
Phase 3:
└─ Claude: Identifies 7 services, 12 integration points
Phase 4:
└─ Synthesis: Service communication patterns
- Consensus: REST for external, gRPC for internal
- Trade-offs documented
- Scaling strategies from Codex examples
Final Report:
├─ Component map (7 services, dependencies)
├─ Integration analysis (12 patterns)
├─ Scalability assessment
└─ Modernization recommendations
Quality: 96/100 ✓
```
### Example 3: Research Synthesis
```
Objective: "Research state management patterns for React 2024"
Phase 2:
├─ Claude: Reviews React docs + examples
├─ Gemini: Web research "React state management 2024"
└─ Codex: Analyzes top 50 React repos
Phase 3:
└─ Pattern analysis: 5 major approaches identified
Phase 4:
└─ Synthesis by use case:
- Small apps: Context (all sources agree)
- Medium apps: Zustand (Gemini + Codex recommend)
- Large apps: Redux Toolkit (battle-tested, Codex data)
- Server state: TanStack Query (trending, Gemini research)
Final Report:
├─ Decision tree by project size
├─ Pros/cons with sources
├─ Migration strategies
└─ Code examples from Codex
Quality: 98/100 ✓
```
---
## Best Practices
### 1. Be Specific with Objectives
```
❌ "Analyze the code"
✅ "Security analysis of authentication module for OWASP Top 10 compliance"
```
### 2. Trust the Verification
Multi-pass verification catches issues. If quality <95, iteration happens automatically.
### 3. Review External Memory
Check `.analysis/` folder during execution to see progress.
### 4. Leverage Citations
Every claim has file:line or URL. Use for validation and deep dives.
### 5. Multi-Session Projects
Large projects can span sessions:
```
Session 1: Initial analysis → ITERATION_1.md
Session 2: Gap filling → ITERATION_2.md
Session 3: Final polish → ANALYSIS_FINAL.md
```
### 6. Check All Three Perspectives
High-value insights often come from comparing AI perspectives.
---
## Troubleshooting
### Low Quality Score (<95)
**Cause**: Gaps in coverage or missing citations
**Solution**: Automatic iteration 2 fills gaps
**Check**: `.analysis/verification/cross-check.md` for details
### Missing Citations
**Cause**: Verification flags uncited claims
**Solution**: Iteration adds missing attributions
**Prevention**: All agents trained to cite sources
### Gemini/Codex Unavailable
**Fallback**: Claude-only analysis with warning
**Impact**: Reduced perspectives but still comprehensive
**Install**: `npm install -g @google/gemini-cli @openai/codex`
### Conflicting Information
**Resolution**: Synthesis phase investigates conflicts
**Method**: Check ground truth (actual code/docs)
**Output**: Documented reasoning for resolution
---
## Related Skills
- `anthropic-expert`: Anthropic product expertise
- `codex-cli`: Codex integration patterns
- `gemini-cli`: Gemini integration patterns
- `tri-ai-collaboration`: General tri-AI workflows
- `analysis`: Code/skill/process analysis
---
## Quick Reference
### Command Line
```bash
# Full automated analysis
bash .claude/skills/multi-ai-research/scripts/analyze.sh "objective"
# Interactive with Claude Code
# Just ask: "Use multi-ai-research for [objective]"
```
### File Locations
| File | Purpose |
|------|---------|
| `.analysis/ANALYSIS_PLAN.md` | Strategy and assignments |
| `.analysis/research/` | All AI research outputs |
| `.analysis/SYNTHESIS_REPORT.md` | Multi-source synthesis |
| `.analysis/ANALYSIS_FINAL.md` | Complete final report |
### Quality Metrics
| Metric | Threshold | Meaning |
|--------|-----------|---------|
| Quality Score | ≥95/100 | Production-ready |
| Citation Coverage | 100% | All claims sourced |
| Completeness | ≥95% | All objectives met |
| Critical Gaps | 0 | No missing essentials |
### Analysis Time Estimates
| Type | Time | Iterations |
|------|------|------------|
| Security | 45-60 min | 1-2 |
| Architecture | 60-90 min | 1-2 |
| Code Quality | 30-45 min | 1 |
| Performance | 45-60 min | 1-2 |
| Research | 30-60 min | 1 |
---
**multi-ai-research delivers production-ready analysis through systematic multi-AI collaboration, rigorous verification, and iterative refinement - ensuring nothing is missed and every claim is verified.**Related Skills
openrouter-research
Research OpenRouter API docs, available Grok model IDs, vision capability for the judge service, and integration patterns. Use when implementing openrouter_tool.py, when checking which Grok model supports vision/image input for judge_service.py, when OpenRouter returns unexpected errors, or when verifying model availability and context limits.
multi-model-reviewer
協調多個 AI 模型(ChatGPT、Gemini、Codex、QWEN、Claude)進行三角驗證,確保「Specification == Program == Test」一致性。過濾假警報後輸出報告,大幅減少人工介入時間。
multi-ai
Start the multi-AI pipeline with a given request. Guides through plan -> review -> implement -> review workflow.
multi-agent-patterns
Supervisor, swarm, and hierarchical multi-agent architectures with context isolation patterns.
multi-agent-estimation
Build multi-agent AI systems for construction estimation. Use CrewAI/LangGraph to orchestrate specialized agents: QTO agent, pricing agent, validation agent. Automate complex estimation workflows.
gpt-researcher
Run GPT-Researcher multi-agent deep research framework locally using OpenAI GPT-5.2. Replaces ChatGPT Deep Research with local control. Researches 100+ sources in parallel, provides comprehensive citations. Use for Phase 3 industry/technical research or comprehensive synthesis. Takes 6-20 min depending on report type. Supports multiple LLM providers.
error-debugging-multi-agent-review
Use when working with error debugging multi agent review
deep-research
Web research with Graph-of-Thoughts for fast-changing topics. Use when user requests research, analysis, investigation, or comparison requiring current information. Features hypothesis testing, source triangulation, claim verification, Red Team, self-critique, and gap analysis. Supports Quick/Standard/Deep/Exhaustive tiers. Creative Mode for cross-industry innovation.
brutal-deepresearch
Structured deep research pipeline with confirmation gates and resume support. Generates outline, launches parallel research agents, produces validated JSON results and markdown report.
agent-multi-repo-swarm
Agent skill for multi-repo-swarm - invoke with $agent-multi-repo-swarm
agent-market-researcher
Expert market researcher specializing in market analysis, consumer insights, and competitive intelligence. Masters market sizing, segmentation, and trend analysis with focus on identifying opportunities and informing strategic business decisions.
agent-data-researcher
Expert data researcher specializing in discovering, collecting, and analyzing diverse data sources. Masters data mining, statistical analysis, and pattern recognition with focus on extracting meaningful insights from complex datasets to support evidence-based decisions.