building-multiagent-systems

This skill should be used when designing or implementing systems with multiple AI agents that coordinate to accomplish tasks. Triggers on "multi-agent", "orchestrator", "sub-agent", "coordination", "delegation", "parallel agents", "sequential pipeline", "fan-out", "map-reduce", "spawn agents", "agent hierarchy".

25 stars

byComeOnOliver

View on GitHub Installation ↓

Best use case

building-multiagent-systems is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using building-multiagent-systems should expect a more consistent output, faster repeated execution, less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$curl -o ~/.claude/skills/building-multiagent-systems/SKILL.md --create-dirs "https://raw.githubusercontent.com/ComeOnOliver/skillshub/main/skills/aiskillstore/marketplace/2389-research/building-multiagent-systems/SKILL.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/building-multiagent-systems/SKILL.md inside your project
Restart your AI agent — it will auto-discover the skill

How building-multiagent-systems Compares

Feature / Agent	building-multiagent-systems	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

What does this skill do?

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# Building Multi-Agent, Tool-Using Agentic Systems

## Overview

Comprehensive architecture patterns for multi-agent systems where AI agents coordinate to accomplish complex tasks using tools. Language-agnostic and applicable across TypeScript, Python, Go, Rust, and other environments.

## Discovery Questions (Required)

Before architecting any system, ask these six mandatory questions:

1. **Starting Point** - Greenfield, adding to existing system, or fixing current implementation?
2. **Primary Use Case** - Parallel work, sequential pipeline, recursive delegation, peer collaboration, work queues, or other?
3. **Scale Expectations** - Small (2-5 agents), medium (10-50), or large (100+)?
4. **State Requirements** - Stateless runs, session-based, or persistent across crashes?
5. **Tool Coordination** - Independent agents, shared read-only resources, write coordination, or rate-limited APIs?
6. **Existing Constraints** - Language, framework, performance needs, compliance requirements?

## Foundational Architecture

### Four-Layer Stack

Every agent follows the four-layer architecture for testability, safety, and modularity:

| Layer | Name | Responsibility |
|-------|------|----------------|
| 1 | Reasoning (LLM) | Plans, critiques, decides which tools to call |
| 2 | Orchestration | Validates, routes, enforces policy, spawns sub-agents |
| 3 | Tool Bus | Schema validation, tool execution coordination |
| 4 | Deterministic Adapters | File I/O, APIs, shell commands, database access |

**Critical Rule**: Everything below Layer 1 must be deterministic. No LLM calls in tools.

See `references/four-layer-architecture.md` for detailed implementation with code examples.

### Foundational Patterns

| Pattern | Purpose |
|---------|---------|
| **Event-Sourcing** | All state changes as events for audit trails and replay |
| **Hierarchical IDs** | Encode delegation hierarchy (e.g., `session.1.2`) for cost aggregation |
| **Agent State Machines** | Explicit states (idle → thinking → tool_execution → stopped) with invalid transition errors |
| **Communication** | EventEmitter for state changes, promises for result collection |

## Seven Coordination Patterns

Choose based on discovery question answers:

| Pattern | Use Case | Trade-offs |
|---------|----------|------------|
| **Fan-Out/Fan-In** | Parallel independent work | Fast but costly; watch for orphans |
| **Sequential Pipeline** | Multi-stage transformations | Bottleneck at slowest stage |
| **Recursive Delegation** | Hierarchical task breakdown | Must add depth limits |
| **Work-Stealing Queue** | 1000+ tasks with load balancing | No built-in priority |
| **Map-Reduce** | Cost optimization | Cheap map ($0.01), smart reduce ($0.15) |
| **Peer Collaboration** | LLM council for bias reduction | Expensive (3N+1 calls), slow |
| **MAKER** | Zero-error tasks (100K+ steps) | 5× cost but ~0% error rate |

See `references/coordination-patterns.md` for detailed implementations.

### Pattern Selection Guide

| Requirement | Recommended Pattern |
|-------------|---------------------|
| Parallel independent tasks | Fan-Out/Fan-In |
| Each stage depends on previous | Sequential Pipeline |
| Complex task decomposition | Recursive Delegation |
| Large batch processing | Work-Stealing Queue |
| Cost-sensitive analysis | Map-Reduce |
| Need diverse perspectives | Peer Collaboration |
| Zero error tolerance | MAKER |

### MAKER Pattern (Zero Errors)

For tasks requiring 100K+ steps with zero error tolerance (medical, financial, legal domains):

1. **Extreme Decomposition** - Recursive breakdown until each subtask <100 steps
2. **Microagents** - Single tool, focused expertise, cheap models
3. **Multi-Agent Voting** - N parallel attempts per subtask, majority consensus
4. **Error Correction** - Deterministic validation + retry with failure context

**Cost comparison**: Same cost as traditional approach, zero errors vs. 10+ errors.

See `references/maker-pattern.md` for full implementation with medical diagnosis example.

## Tool Coordination

| Mechanism | Purpose |
|-----------|---------|
| **Permission Inheritance** | Children inherit subset of parent permissions (cannot escalate) |
| **Resource Locking** | Acquire/release patterns for shared resources |
| **Rate Limiting** | Token bucket algorithm across all agents |
| **Result Caching** | Cache read-only, idempotent, expensive operations |

**Sub-Agent as Tool Pattern**: Wrap specialized agents as tools the parent can call, providing composable abstractions and natural lifecycle management.

See `references/tool-coordination.md` for implementations.

## Critical Lifecycle: Cascading Stop

"Always stop children before stopping self." This prevents orphaned agents.

```
1. Get all child agents
2. Stop all children in parallel
3. Stop self
4. Cancel ongoing work
5. Flush events
```

If pause/resume unavailable, implement manual checkpointing: save agent state (messages, context, tool results), then restore later.

## Production Hardening

| Concern | Solution |
|---------|----------|
| **Orphan Detection** | Heartbeat monitoring every 30 seconds |
| **Cost Tracking** | Hierarchical aggregation across agent tree |
| **Session Persistence** | Project-level task store for cross-session work |
| **Checkpointing** | Save after 10+ tools, $1.00 cost, or 5 minutes elapsed |
| **Self-Modification Safety** | Blast radius assessment, branch isolation, test-first |

See `references/production-hardening.md` for detailed implementations.

## Real-World Example: Code Review System

A pull request orchestrator using Fan-Out/Fan-In:

1. Spawns four specialist reviewers in parallel (security, performance, style, tests)
2. Security and tests use smart models (Sonnet); style and performance use fast models (Haiku)
3. Each reviewer has 2-minute timeout
4. Results aggregate regardless of partial failures
5. Costs track per reviewer
6. All agents stop cleanly via cascading stop after completion

## Execution Checklist

When guiding implementation of multi-agent systems:

1. **Ask discovery questions** - Understand requirements before architecting
2. **Assess error tolerance** - Zero errors → MAKER; some acceptable → simpler patterns
3. **Establish four-layer architecture** - Reasoning, orchestration, tool bus, adapters
4. **Design schema-first tools** - Typed contracts before implementation
5. **Define deterministic boundary** - No LLM in Layers 3-4
6. **Choose orchestration model** - YOLO, Safety-First, or Hybrid
7. **Select coordination pattern** - Fan-out, pipeline, delegation, queue, map-reduce, peer, or MAKER
8. **Design tool coordination** - Permission inheritance, locking, rate limiting
9. **Implement cascading cleanup** - Always stop children before parent
10. **Add monitoring and cost tracking** - Hierarchical aggregation across agent tree
11. **Consider self-modification safety** - If agents can modify code, add safety protocol

## Common Pitfalls

| Pitfall | Impact |
|---------|--------|
| Missing four-layer architecture | Untestable, unsafe, hard to debug |
| LLM calls in tools (Layer 3-4) | Non-deterministic, can't unit test |
| No schema-first tool design | Sub-agents can't discover tools |
| Missing cascading stop | Orphaned agents consuming resources |
| No permission inheritance | Sub-agents can escalate privileges |
| No timeouts | Indefinite hangs waiting for sub-agents |
| Unbounded concurrency | Resource exhaustion from too many agents |
| Ignoring cost tracking | Budget surprises |
| No partial-failure handling | One failure cascades to all agents |
| Unpersisted state | Unrecoverable workflows on crash |
| Uncoordinated tool access | Race conditions on shared resources |
| Wrong model selection | Cost inefficiency (Sonnet for simple tasks) |
| Self-modification without safety | Sub-agents break themselves |
| No heartbeat monitoring | Can't detect orphans after parent crash |

## Reference Files

Detailed implementations with code examples:

| File | Contents |
|------|----------|
| `references/four-layer-architecture.md` | Four-layer stack, deterministic boundary, schema-first tools |
| `references/coordination-patterns.md` | Seven coordination patterns with code |
| `references/maker-pattern.md` | MAKER implementation, voting, medical diagnosis example |
| `references/tool-coordination.md` | Permission inheritance, locking, rate limiting, caching |
| `references/production-hardening.md` | Cascading stop, orphan detection, cost tracking, checkpointing |

Related Skills

building-terraform-modules

from ComeOnOliver/skillshub

This skill empowers Claude to build reusable Terraform modules based on user specifications. It leverages the terraform-module-builder plugin to generate production-ready, well-documented Terraform module code, incorporating best practices for security, scalability, and multi-platform support. Use this skill when the user requests to create a new Terraform module, generate Terraform configuration, or needs help structuring infrastructure as code using Terraform. The trigger terms include "create Terraform module," "generate Terraform configuration," "Terraform module code," and "infrastructure as code."

building-recommendation-systems

from ComeOnOliver/skillshub

This skill empowers Claude to construct recommendation systems using collaborative filtering, content-based filtering, or hybrid approaches. It analyzes user preferences, item features, and interaction data to generate personalized recommendations. Use this skill when the user requests to build a recommendation engine, needs help with collaborative filtering, wants to implement content-based filtering, or seeks to rank items based on relevance for a specific user or group of users. It is triggered by requests involving "recommendations", "collaborative filtering", "content-based filtering", "ranking items", or "building a recommender".

orchestrating-multi-agent-systems

from ComeOnOliver/skillshub

Execute orchestrate multi-agent systems with handoffs, routing, and workflows across AI providers. Use when building complex AI systems requiring agent collaboration, task delegation, or workflow coordination. Trigger with phrases like "create multi-agent system", "orchestrate agents", or "coordinate agent workflows".

building-neural-networks

from ComeOnOliver/skillshub

This skill allows Claude to construct and configure neural network architectures using the neural-network-builder plugin. It should be used when the user requests the creation of a new neural network, modification of an existing one, or assistance with defining the layers, parameters, and training process. The skill is triggered by requests involving terms like "build a neural network," "define network architecture," "configure layers," or specific mentions of neural network types (e.g., "CNN," "RNN," "transformer").

building-gitops-workflows

from ComeOnOliver/skillshub

This skill enables Claude to construct GitOps workflows using ArgoCD and Flux. It is designed to generate production-ready configurations, implement best practices, and ensure a security-first approach for Kubernetes deployments. Use this skill when the user explicitly requests "GitOps workflow", "ArgoCD", "Flux", or asks for help with setting up a continuous delivery pipeline using GitOps principles. The skill will generate the necessary configuration files and setup code based on the user's specific requirements and infrastructure.

building-classification-models

from ComeOnOliver/skillshub

This skill enables Claude to construct and evaluate classification models using provided datasets or specifications. It leverages the classification-model-builder plugin to automate model creation, optimization, and reporting. Use this skill when the user requests to "build a classifier", "create a classification model", "train a classification model", or needs help with supervised learning tasks involving labeled data. The skill ensures best practices are followed, including data validation, error handling, and performance metric reporting.

building-websocket-server

from ComeOnOliver/skillshub

Build scalable WebSocket servers for real-time bidirectional communication. Use when enabling real-time bidirectional communication. Trigger with phrases like "build WebSocket server", "add real-time API", or "implement WebSocket".

building-graphql-server

from ComeOnOliver/skillshub

Build production-ready GraphQL servers with schema design, resolvers, and subscriptions. Use when building GraphQL APIs with schemas and resolvers. Trigger with phrases like "build GraphQL API", "create GraphQL server", or "setup GraphQL".

building-cicd-pipelines

from ComeOnOliver/skillshub

Execute use when you need to work with deployment and CI/CD. This skill provides deployment automation and pipeline orchestration with comprehensive guidance and automation. Trigger with phrases like "deploy application", "create pipeline", or "automate deployment".

building-automl-pipelines

from ComeOnOliver/skillshub

Build automated machine learning pipelines with feature engineering, model selection, and hyperparameter tuning. Use when automating ML workflows from data preparation through model deployment. Trigger with phrases like "build automl pipeline", "automate ml workflow", or "create automated training pipeline".

building-api-gateway

from ComeOnOliver/skillshub

Create API gateways with routing, load balancing, rate limiting, and authentication. Use when routing and managing multiple API services. Trigger with phrases like "build API gateway", "create API router", or "setup API gateway".

building-api-authentication

from ComeOnOliver/skillshub

Build secure API authentication systems with OAuth2, JWT, API keys, and session management. Use when implementing secure authentication flows. Trigger with phrases like "build authentication", "add API auth", or "secure the API".