brainstorming

Collaborative ideation and planning with resilient multi-model exploration, consensus scoring, and adaptive confidence-based validation

248 stars

byMadAppGang

View on GitHub Installation ↓

Best use case

brainstorming is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Collaborative ideation and planning with resilient multi-model exploration, consensus scoring, and adaptive confidence-based validation

Teams using brainstorming should expect a more consistent output, faster repeated execution, less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$curl -o ~/.claude/skills/brainstorming/SKILL.md --create-dirs "https://raw.githubusercontent.com/MadAppGang/claude-code/main/plugins/dev/skills/planning/brainstorming/SKILL.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/brainstorming/SKILL.md inside your project
Restart your AI agent — it will auto-discover the skill

How brainstorming Compares

Feature / Agent	brainstorming	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

What does this skill do?

Collaborative ideation and planning with resilient multi-model exploration, consensus scoring, and adaptive confidence-based validation

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# Brainstorming v2.0: Resilient Multi-Model Planning

Turn ideas into validated designs through collaborative AI dialogue with resilient model execution and confidence-based validation.

## Overview

This skill improves upon v1.0 by addressing critical reliability gaps:

**Key v2.0 Improvements:**
- **No AskUserQuestion dependency**: Uses Task + Tasks for structured interaction
- **Fallback chains**: 3+ models per role ensures completion even if some fail
- **Explicit parallelism**: Documented Task call patterns for parallel execution
- **Defined algorithms**: Consensus matrix and confidence scoring are mathematically specified

## When to Use

Use this skill BEFORE implementing any feature:
- "Design a user authentication system"
- "Brainstorm approaches for API rate limiting"
- "Plan architecture for a new dashboard feature"
- "Evaluate options for real-time data synchronization"

## Prerequisites

### Required Setup

```bash
# 1. Install required skills
/plugin marketplace add MadAppGang/claude-code
skill install superpowers:using-git-worktrees
skill install superpowers:writing-plans

# 2. Verify OpenRouter access (for multi-model)
export OPENROUTER_API_KEY=your-key

# 3. Configure models in ~/.claude/settings.json
{
  "brainstorming": {
    "primary_model": "anthropic/claude-opus-4-20250514",
    "explorer_models": [
      "x-ai/grok-code-fast-1",
      "google/gemini-2-5-pro",
      "anthropic/claude-sonnet-4-20250514"
    ]
  }
}
```

### Model Requirements

| Role | Min Context | Capabilities |
|------|-------------|--------------|
| Primary | 200K tokens | Complex reasoning, orchestration |
| Explorer | 100K tokens | Code generation, analysis |

## Workflow

### Phase 0: Problem Analysis (200-300 words)

**Objective**: Capture problem scope, constraints, and success criteria

**How to Ask Users (Without AskUserQuestion)**:

```typescript
// Pattern: Use Tasks to track questions, Read/Write for presentation

// 1. Write question to temp file
await Write({
  file_path: "/tmp/brainstorm-q1.md",
  content: `## Question 1 of 3

**What are the main constraints or requirements for this feature?**

Please respond with:
- Functional requirements (what it must do)
- Non-functional requirements (performance, scale)
- Any existing dependencies or integrations
`
});

// 2. Present file and wait for user response
// User reads file, provides input via conversation

// 3. Summarize understanding
const problemSummary = await Write({
  file_path: "/tmp/brainstorm-problem.md",
  content: `## Problem Understanding

**Constraints identified:**
- [From user response]

**Success criteria:**
- [Measurable outcomes]

**Scope boundaries:**
- [What's in/out]

---

**Does this accurately capture the problem?** (Reply "yes" to proceed or clarify)
`
});
```

**Gate Type**: USER_GATE (requires confirmation)

---

### Phase 1: Parallel Exploration

**Objective**: Generate diverse solutions via multi-model brainstorming

**Fallback Chain Implementation**:

```typescript
interface ModelResult {
  model: string;
  success: boolean;
  output?: string;
  error?: string;
}

async function exploreWithFallback(
  prompt: string,
  role: "explorer"
): Promise<ModelResult> {
  const fallbackModels = role === "explorer"
    ? ["x-ai/grok-code-fast-1", "google/gemini-2-5-pro", "deepseek/deepseek-coder"]
    : ["anthropic/claude-opus-4-20250514", "anthropic/claude-sonnet-4-20250514"];

  for (const model of fallbackModels) {
    try {
      const result = await Task({
        model: model,
        prompt: prompt,
        timeout_ms: 120000  // 2 minute timeout
      });

      return { model, success: true, output: result };
    } catch (error) {
      console.warn(`Model ${model} failed:`, error.message);
      continue;  // Try next in chain
    }
  }

  throw new Error(`All models in fallback chain failed`);
}
```

**Parallel Execution Pattern**:

```typescript
// WRONG: Sequential (slow)
// const result1 = await Task({ model: "grok", ... });
// const result2 = await Task({ model: "gemini", ... });
// const result3 = await Task({ model: "sonnet", ... });

// CORRECT: Parallel (3-5x faster)
const [result1, result2, result3] = await Promise.all([
  Task({
    model: "x-ai/grok-code-fast-1",
    prompt: generateExplorerPrompt(problem, "fast_code")
  }),
  Task({
    model: "google/gemini-2-5-pro",
    prompt: generateExplorerPrompt(problem, "balanced")
  }),
  Task({
    model: "anthropic/claude-sonnet-4-20250514",
    prompt: generateExplorerPrompt(problem, "thorough")
  })
]);

// Handle partial failures
const results = [result1, result2, result3].filter(r => r.success);
if (results.length === 0) {
  throw new Error("All exploration models failed");
}
```

**Output Format**:
```markdown
## Approach: [Name]

**Model**: [Which model generated this]
**Approach Type**: [architecture/algorithm/pattern]
**Summary**: 2-3 sentences

**Key Components**:
1. Component A
2. Component B
3. Component C

**Trade-offs**:
- + Advantage
- - Disadvantage

**Confidence**: [Model's confidence 0-100]
```

**Gate Type**: AUTO_GATE (automatic consolidation)

---

### Phase 2: Consensus Analysis

**Objective**: Identify strongest ideas using defined algorithms

**Consensus Matrix Algorithm**:
1. **Clustering**: Group approaches by semantic similarity (vector embedding + clustering)
2. **Scoring**: Count model agreement per cluster
3. **Classification**: UNANIMOUS (3/3), STRONG (2/3), DIVERGENT (1/3)
4. **Confidence**: Weighted average of model confidences + agreement bonus

**Consensus Matrix Calculation**:

```typescript
interface Approach {
  id: string;
  name: string;
  summary: string;
  model: string;  // Which model proposed
  modelConfidence: number;  // 0-100
  embedding: number[];  // For clustering
}

interface Cluster {
  approaches: Approach[];
  representative: Approach;  // Most complete
  agreementScore: number;  // 0-1
  confidenceScore: number;  // 0-100
  consensusLevel: "UNANIMOUS" | "STRONG" | "DIVERGENT";
}

function calculateConsensus(approaches: Approach[]): Cluster[] {
  // Step 1: Cluster by semantic similarity
  const clusters = clusterByEmbedding(approaches, threshold: 0.85);

  // Step 2: Calculate metrics per cluster
  return clusters.map(cluster => {
    const models = cluster.map(a => a.model);
    const modelCount = new Set(models).size;
    const totalModels = approaches.length;

    // Agreement: proportion of models that have an approach in this cluster
    const agreementScore = modelCount / totalModels;

    // Confidence: weighted average + agreement bonus
    const baseConfidence = cluster
      .map(a => a.modelConfidence)
      .reduce((a, b) => a + b, 0) / cluster.length;

    const confidenceScore = Math.min(100,
      baseConfidence + (agreementScore * 20)  // +20% for agreement
    );

    // Consensus classification
    const consensusLevel = agreementScore >= 0.9 ? "UNANIMOUS" :
                          agreementScore >= 0.5 ? "STRONG" :
                          "DIVERGENT";

    return {
      approaches: cluster,
      representative: cluster.reduce((best, current) =>
        current.modelConfidence > best.modelConfidence ? current : best
      ),
      agreementScore,
      confidenceScore: Math.round(confidenceScore),
      consensusLevel
    };
  }).sort((a, b) => b.confidenceScore - a.confidenceScore);
}
```

**Confidence Scoring Formula**:

```
Confidence = Base + AgreementBonus - DiversityPenalty

Where:
  Base = average(model confidences in cluster)
  AgreementBonus = (unique_models / total_models) * 20
  DiversityPenalty = (1 - similarity_coefficient) * 10

Example:
  3 models propose similar approaches
  Base = (92 + 88 + 95) / 3 = 91.7
  AgreementBonus = (3/3) * 20 = 20
  DiversityPenalty = (1 - 0.9) * 10 = 1
  Confidence = 91.7 + 20 - 1 = 110.7 -> capped at 100
  Final: 97%
```

**Consensus Matrix Example**:

| Approach | Grok | Gemini | Sonnet | Agreement | Confidence |
|----------|------|--------|--------|-----------|------------|
| Token Bucket | Yes | Yes | Yes | UNANIMOUS | 97% |
| Leaky Bucket | Yes | Yes | No | STRONG | 82% |
| Sliding Window | No | No | Yes | DIVERGENT | 45% |

**Gate Type**: AUTO_GATE (automatic scoring)

---

### Phase 3: User Selection

**Objective**: Present top approaches for user decision

**Presentation Pattern**:

```typescript
async function presentApproaches(clusters: Cluster[]): Promise<string> {
  const topClusters = clusters.slice(0, 5);  // Top 5

  let presentation = `## Top Approaches\n\n`;

  for (const [index, cluster] of topClusters.entries()) {
    const approach = cluster.representative;

    presentation += `### ${String.fromCharCode(65 + index)}: ${approach.name} [${cluster.consensusLevel}]

**Summary**: ${approach.summary}

**Confidence**: ${cluster.confidenceScore}% (${cluster.approaches.length} model(s) agree)

**Pros**:
${cluster.approaches.map(a => `- ${a.summary}`).join("\n")}

**Cons**:
${cluster.approaches.map(a => `- Potential issue from ${a.model}`).join("\n")}

---
`;
  }

  presentation += `
## Your Choice

Which approach best fits your requirements?

- **A**: Select approach A
- **B**: Select approach B
- **C**: Select approach C
- **D**: Combine elements from multiple
- **E**: Explore alternatives (return to Phase 1)
`;

  // Save for user review
  await Write({
    file_path: "/tmp/brainstorm-approaches.md",
    content: presentation
  });

  return presentation;
}
```

**Gate Type**: USER_GATE (selection via conversation)

---

### Phase 4: Detailed Planning

**Objective**: Elaborate selected approach into actionable sections

**Confidence-Based Gating**:

| Confidence | Gate Type | Action |
|------------|-----------|--------|
| >=95% | AUTO_GATE | Proceed automatically |
| 80-94% | AUTO_GATE | Proceed with notification |
| 60-79% | USER_GATE | Request confirmation |
| <60% | USER_GATE | Require revision |

**Section Template**:

```markdown
## [Section Name] (Confidence: XX%)

**Approach**: [Selected approach]

**Implementation Details**:
[200-300 words]

**Assumptions**:
- Assumption 1
- Assumption 2

**Confidence Calculation**:
- Technical feasibility: XX%
- Edge cases covered: XX%
- Team capability: XX%
- Overall: XX%

**Status**: [AUTO_GATE|PENDING_USER] - [Reason]
```

**Gate Type**: MIXED_GATE (adaptive)

---

### Phase 5: Plan Validation

**Objective**: Final review before implementation

**Validation Checklist**:

```markdown
## Plan Validation

**Problem**: [Summary]
**Approach**: [Selected]
**Confidence**: [Overall]

### Checklist

- [ ] Problem scope accurately captured
- [ ] Chosen approach matches expectations
- [ ] Module structure aligns with capabilities
- [ ] Technical constraints addressed
- [ ] Success criteria measurable

### Next Steps

**To proceed**:
1. Reply "approve" to finalize
2. Reply "revise [section]" to modify
3. Reply "restart" to begin fresh

**Final decision?**
```

**Gate Type**: USER_GATE (explicit approval)

---

## Complete Parallel Execution Example

```typescript
// Complete Phase 1 parallel exploration
async function runParallelExploration(problem: string): Promise<Approach[]> {
  const explorerModels = [
    "x-ai/grok-code-fast-1",     // Fast, code-focused
    "google/gemini-2-5-pro",     // Balanced, creative
    "anthropic/claude-sonnet-4-20250514"  // Thorough
  ];

  const prompts = explorerModels.map(model =>
    `Generate 5 implementation approaches for: ${problem}

For each approach provide:
1. Name (2-3 words)
2. One-sentence summary
3. Key components (bullet points)
4. Trade-offs (+/-)
5. Your confidence (0-100)

Format as JSON array.`
  );

  // LAUNCH ALL MODELS IN PARALLEL
  const taskPromises = explorerModels.map((model, index) =>
    Task({
      model: model,
      prompt: prompts[index],
      timeout_ms: 120000,
      max_turns: 1
    }).catch(error => ({
      model,
      success: false,
      error: error.message
    }))
  );

  // WAIT FOR ALL TO COMPLETE
  const results = await Promise.all(taskPromises);

  // CONSOLIDATE SUCCESSFUL RESULTS
  const approaches: Approach[] = results
    .filter(r => r.success)
    .flatMap(r => parseApproaches(r.output));

  // HANDLE PARTIAL FAILURES
  if (approaches.length < 5) {
    console.warn(`Only got ${approaches.length} approaches from ${explorerModels.length} models`);
    if (approaches.length === 0) {
      throw new Error("All models failed");
    }
  }

  return approaches;
}
```

## Troubleshooting

### Model Failures

| Symptom | Cause | Solution |
|---------|-------|----------|
| Single model fails | API error, timeout | Fallback chain handles automatically |
| All models fail | API key issue, network | Check `OPENROUTER_API_KEY`, retry |
| Partial results (2/3) | One model unavailable | Continue with available; lower diversity but valid |

**Recovery Pattern**:
```typescript
async function resilientExploration(problem: string): Promise<Approach[]> {
  let attempts = 0;
  const maxAttempts = 3;

  while (attempts < maxAttempts) {
    try {
      return await runParallelExploration(problem);
    } catch (error) {
      attempts++;
      if (attempts === maxAttempts) throw error;

      // Exponential backoff
      await new Promise(r => setTimeout(r, Math.pow(2, attempts) * 1000));
    }
  }
}
```

### Consensus Issues

| Symptom | Cause | Solution |
|---------|-------|----------|
| All approaches DIVERGENT | Models produce very different ideas | Not a failure - indicates novel problem |
| Single cluster with 90%+ confidence | Problem is well-understood | Good for AUTO_GATE |
| No clear winner | Multiple valid approaches | Present all to user |

### User Interaction Issues

| Symptom | Cause | Solution |
|---------|-------|----------|
| User doesn't respond | Unclear question | Rewrite with specific format |
| User provides conflicting answers | Multiple questions at once | Ask one at a time, confirm understanding |
| User wants to restart | Dissatisfied with direction | Allow restart to Phase 0 |

---

## Configuration

### Environment Variables

```bash
# Required for multi-model
OPENROUTER_API_KEY=...

# Optional
BRAINSTORM_TIMEOUT_MS=120000
BRAINSTORM_MAX_RETRIES=2
BRAINSTORM_MIN_MODELS=2  # Minimum models for valid consensus
```

### Model Configuration

```json
{
  "brainstorming": {
    "primary": ["anthropic/claude-opus-4-20250514"],
    "explorers": {
      "primary_chain": ["x-ai/grok-code-fast-1", "google/gemini-2-5-pro"],
      "fallback_chain": ["deepseek/deepseek-coder", "anthropic/claude-haiku-3-20250514"]
    },
    "thresholds": {
      "auto_gate": 95,
      "confirm_gate": 60
    }
  }
}
```

---

## Performance

| Metric | v1.0 | v2.0 | Improvement |
|--------|------|------|-------------|
| Model failure recovery | 0% | 95% | Fallback chains |
| Consensus calculation | Undefined | Defined | Mathematically specified |
| User interaction | AskUserQuestion | Task+Write | No tool dependency |
| Parallel execution | Implicit | Explicit | Documented patterns |

---

## Comparison: v1.0 vs v2.0

| Aspect | v1.0 | v2.0 |
|--------|------|------|
| AskUserQuestion | Required | Removed |
| Model fallbacks | None | 3+ per role |
| Parallel pattern | Described | Code example |
| Consensus algorithm | Table example | Full implementation |
| Confidence formula | Mentioned | Math specified |
| Troubleshooting | 4 items | 10+ items |
| Prerequisites | None | Setup guide |

---

## Version History

**v2.0.0** (2026-01-30):
- Removed AskUserQuestion dependency
- Added model fallback chains
- Explicit parallel execution patterns
- Defined consensus matrix algorithm
- Added confidence scoring formula
- Added prerequisites and setup guide
- Expanded troubleshooting section
- Winner of blind multi-model voting (3/3 votes, avg confidence 8.7/10)

**v1.0.0** (2026-01-30):
- Initial release
- 6-phase workflow
- Multi-model exploration
- Confidence-based gating

---

**Status**: v2.0 Ready for use
**Tested**: Fallback chains, parallel execution, consensus algorithm
**Known Limitations**: Requires OpenRouter for multi-model access

Related Skills

test-skill

248

from MadAppGang/claude-code

A test skill for validation testing. Use when testing skill parsing and validation logic.

bad-skill

248

from MadAppGang/claude-code

This skill has invalid YAML in frontmatter

release

248

from MadAppGang/claude-code

Plugin release process for MAG Claude Plugins marketplace. Covers version bumping, marketplace.json updates, git tagging, and common mistakes. Use when releasing new plugin versions or troubleshooting update issues.

openrouter-trending-models

248

from MadAppGang/claude-code

Fetch trending programming models from OpenRouter rankings. Use when selecting models for multi-model review, updating model recommendations, or researching current AI coding trends. Provides model IDs, context windows, pricing, and usage statistics from the most recent week.

Claudish Integration Skill

248

from MadAppGang/claude-code

**Version:** 1.0.0

transcription

248

from MadAppGang/claude-code

Audio/video transcription using OpenAI Whisper. Covers installation, model selection, transcript formats (SRT, VTT, JSON), timing synchronization, and speaker diarization. Use when transcribing media or generating subtitles.

final-cut-pro

248

from MadAppGang/claude-code

Apple Final Cut Pro FCPXML format reference. Covers project structure, timeline creation, clip references, effects, and transitions. Use when generating FCP projects or understanding FCPXML structure.

ffmpeg-core

248

from MadAppGang/claude-code

FFmpeg fundamentals for video/audio manipulation. Covers common operations (trim, concat, convert, extract), codec selection, filter chains, and performance optimization. Use when planning or executing video processing tasks.

statusline-customization

248

from MadAppGang/claude-code

Configuration reference and troubleshooting for the statusline plugin — sections, themes, bar widths, and script architecture

technical-audit

248

from MadAppGang/claude-code

Technical SEO audit methodology including crawlability, indexability, and Core Web Vitals analysis. Use when auditing pages or sites for technical SEO issues.

serp-analysis

248

from MadAppGang/claude-code

SERP analysis techniques for intent classification, feature identification, and competitive intelligence. Use when analyzing search results for content strategy.

schema-markup

248

from MadAppGang/claude-code

Schema.org markup implementation patterns for rich results. Use when adding structured data to content for enhanced SERP appearances.