skeleton-of-thought

Parallel generation through a skeleton-first approach, for up to 2x speedup

170 stars

Best use case

skeleton-of-thought is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using skeleton-of-thought should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/skeleton-of-thought/SKILL.md --create-dirs "https://raw.githubusercontent.com/Miosa-osa/canopy/main/library/skills/ai-patterns/skeleton-of-thought/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/skeleton-of-thought/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How skeleton-of-thought Compares

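| Feature / Agent         | skeleton-of-thought | Standard Approach |
|-------------------------|---------------------|-------------------|
| Platform Support        | Not specified       | Limited / Varies  |
| Context Awareness       | High                | Baseline          |
| Installation Complexity | Unknown             | N/A               |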

Frequently Asked Questions

What does this skill do?

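It generates a skeleton (outline) of the answer first, then expands each skeleton point in parallel and assembles the results, for up to a 2x speedup over sequential generation.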

Where can I find the source code?

The source code is on GitHub in the Miosa-osa/canopy repository, under library/skills/ai-patterns/skeleton-of-thought.

SKILL.md Source

# Skeleton-of-Thought (SoT) Skill

Achieves up to 2x speedup through parallel content generation.

## Core Concept

Instead of sequential generation, SoT:
1. First generates a skeleton (outline) of the answer
2. Then expands each skeleton point in PARALLEL
3. Finally assembles the complete response

## When to Use

- Long-form responses (>500 words expected)
- Structured content (lists, tutorials, explanations)
- Time-sensitive queries
- Batch processing scenarios

## When NOT to Use

- Math/reasoning problems (need sequential thought)
- Simple factual queries
- Creative writing requiring flow
- Code generation requiring context

## Process

### Step 1: Skeleton Generation
```
Given question, generate ONLY the skeleton outline:
- Point 1: [brief description]
- Point 2: [brief description]
- Point 3: [brief description]
...
Do NOT expand. Just the skeleton.
```
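
The model's reply to this prompt is plain text, so it has to be split into individual points before anything can run in parallel. The following is a minimal sketch of that parsing step, assuming the reply uses either the `- Point N:` format shown above or the `N.` numbered format from the Implementation section. The name `parse_skeleton` is hypothetical and not part of the skill.

```python
import re

# Accepts "- Point 1: text" and "1. text"; any other line is ignored.
_POINT_LINE = re.compile(r"^\s*(?:-\s*Point\s+(\d+)|(\d+)\.)\s*:?\s*(.+?)\s*$")


def parse_skeleton(text: str) -> list[tuple[int, str]]:
    """Return (point_number, description) pairs in the order they appear."""
    points = []
    for line in text.splitlines():
        match = _POINT_LINE.match(line)
        if match:
            number = int(match.group(1) or match.group(2))
            points.append((number, match.group(3)))
    return points
```

Lines such as `...` or `Do NOT expand.` do not match the pattern, so stray instructions echoed back by the model are dropped rather than treated as points.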

### Step 2: Parallel Expansion
Launch parallel expansions for each point:
```
Point 1 → Agent 1 → Expanded content
Point 2 → Agent 2 → Expanded content
Point 3 → Agent 3 → Expanded content
(all run simultaneously)
```
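
One way to run the expansions simultaneously is a thread pool, which suits workloads that mostly wait on network calls. This is a sketch under assumptions: `expand_point` is a hypothetical callable that would wrap whatever model or agent call you use, and `fake_expand` is a stand-in so the example runs without any API.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def expand_in_parallel(
    points: list[str],
    expand_point: Callable[[int, str], str],
    max_workers: int = 8,
) -> list[str]:
    """Expand every skeleton point concurrently; results keep skeleton order."""
    if not points:
        return []
    workers = min(max_workers, len(points))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map yields results in input order, even if a later
        # point finishes before an earlier one.
        numbers = range(1, len(points) + 1)
        return list(pool.map(expand_point, numbers, points))


def fake_expand(number: int, description: str) -> str:
    # Hypothetical stand-in for a real model call.
    return f"[{number}] {description} (expanded)"
```

Calling `expand_in_parallel(["a", "b"], fake_expand)` returns one expanded string per point. Because the order matches the skeleton, the assembly step does not need to re-sort anything.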

### Step 3: Assembly
```
Combine expanded points with transitions:
[Introduction]
[Point 1 expanded]
[Transition]
[Point 2 expanded]
[Transition]
[Point 3 expanded]
[Conclusion]
```
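
The layout above can be sketched as a small joining function. It assumes the introduction, conclusion, and transitions are already available as strings (for example from one extra model call). The `assemble` helper and its parameters are hypothetical names used for illustration.

```python
from typing import Optional


def assemble(
    expanded_points: list[str],
    introduction: str = "",
    conclusion: str = "",
    transitions: Optional[list[str]] = None,
) -> str:
    """Join expanded points in order, with a transition between neighbours."""
    transitions = transitions or []
    parts = [introduction] if introduction else []
    for index, body in enumerate(expanded_points):
        # transitions[i] sits between point i+1 and point i+2; missing or
        # empty transitions are simply skipped.
        if index > 0 and index - 1 < len(transitions) and transitions[index - 1]:
            parts.append(transitions[index - 1])
        parts.append(body)
    if conclusion:
        parts.append(conclusion)
    return "\n\n".join(parts)
```

Every piece is optional except the points themselves, so the same function covers a bare concatenation and the full introduction-to-conclusion layout.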

## Implementation

```python
# Skeleton generation prompt
SKELETON_PROMPT = """
For the question: {question}

Generate ONLY a skeleton outline with 3-8 key points.
Format:
1. [Point]: [5-10 word description]
2. [Point]: [5-10 word description]
...

Do NOT expand the points. ONLY the skeleton.
"""

# Point expansion prompt
EXPAND_PROMPT = """
Context: Answering "{question}"
Skeleton: {skeleton}

Expand ONLY point {point_number}: "{point_description}"

Write 2-4 sentences expanding this point.
Do not include other points.
"""
```
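
To show how the expansion template might be filled in, here is a minimal sketch that renders one prompt per skeleton point. The template is repeated so the block runs on its own, and `build_expand_prompts` is a hypothetical helper, not something the skill defines.

```python
# Copy of the expansion template above, repeated so this sketch is self-contained.
EXPAND_PROMPT = """
Context: Answering "{question}"
Skeleton: {skeleton}

Expand ONLY point {point_number}: "{point_description}"

Write 2-4 sentences expanding this point.
Do not include other points.
"""


def build_expand_prompts(question: str, points: list[str]) -> list[str]:
    """Render one expansion prompt per skeleton point (hypothetical helper)."""
    # Every prompt carries the whole skeleton so each expansion sees the full plan.
    skeleton = "\n".join(f"{i}. {p}" for i, p in enumerate(points, start=1))
    return [
        EXPAND_PROMPT.format(
            question=question,
            skeleton=skeleton,
            point_number=i,
            point_description=p,
        )
        for i, p in enumerate(points, start=1)
    ]
```

Each rendered prompt is independent of the others, which is what allows them to be sent at the same time.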

## Performance

| Query Type  | Sequential Time | SoT Time | Speedup |
|-------------|-----------------|----------|---------|
| Tutorial    | 10s             | 5s       | 2.0x    |
| Explanation | 8s              | 4.5s     | 1.8x    |
| List-based  | 12s             | 6s       | 2.0x    |
| Analysis    | 15s             | 9s       | 1.7x    |
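
The Speedup column is the sequential time divided by the SoT time, rounded to one decimal place. The short check below recomputes it from the table's own figures. The names `ROWS` and `speedup` are illustrative only.

```python
# (query type, sequential seconds, SoT seconds), copied from the table above.
ROWS = [
    ("Tutorial", 10.0, 5.0),
    ("Explanation", 8.0, 4.5),
    ("List-based", 12.0, 6.0),
    ("Analysis", 15.0, 9.0),
]


def speedup(sequential_s: float, sot_s: float) -> float:
    """Speedup is the sequential latency divided by the SoT latency."""
    return sequential_s / sot_s


for name, sequential_s, sot_s in ROWS:
    print(f"{name}: {speedup(sequential_s, sot_s):.1f}x")
```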

## Quality Preservation

- Skeleton ensures coherent structure
- Each expansion has full question context
- Assembly adds smooth transitions
- Quality remains high despite parallelism

## Combination with Other Skills

- **Self-Consistency**: Generate multiple skeletons, vote on best
- **ToT** (Tree of Thoughts): Use ToT for skeleton generation, SoT for expansion
- **LATS** (Language Agent Tree Search): Apply LATS scoring to skeleton options
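
As an illustration of the Self-Consistency combination, the sketch below picks the skeleton that appears most often among several sampled candidates. Exact-match majority voting after light normalisation is an assumption made here for simplicity, since the skill does not specify how the vote works. `vote_on_skeleton` is a hypothetical name.

```python
from collections import Counter


def _normalise(skeleton: list[str]) -> tuple[str, ...]:
    # Lowercase and collapse whitespace so trivial differences do not split votes.
    return tuple(" ".join(point.lower().split()) for point in skeleton)


def vote_on_skeleton(candidates: list[list[str]]) -> list[str]:
    """Return the candidate skeleton that occurs most often (assumed voting rule)."""
    if not candidates:
        raise ValueError("need at least one candidate skeleton")
    counts = Counter(_normalise(skeleton) for skeleton in candidates)
    winner, _ = counts.most_common(1)[0]
    # Hand back the first original candidate that matches the winning form.
    return next(s for s in candidates if _normalise(s) == winner)
```

In practice sampled skeletons rarely match word for word, so a real setup would likely need a looser similarity measure or a model-based judge. That choice is left open here.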

---

*Reference: "Skeleton-of-Thought: Large Language Models Can Do Parallel Decoding" (2023)*