system-design-patterns

System design patterns for scalability, reliability, and performance. Use when: (1) designing distributed systems, (2) planning for scale, (3) making architecture decisions, (4) evaluating trade-offs.

Best use case

system-design-patterns is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using system-design-patterns should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/system-design-patterns/SKILL.md --create-dirs "https://raw.githubusercontent.com/diegosouzapw/awesome-omni-skill/main/skills/development/system-design-patterns/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/system-design-patterns/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How system-design-patterns Compares

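| Feature / Agent | system-design-patterns | Standard Approach |
|-----------------|------------------------|-------------------|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |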

SKILL.md Source

# System Design Patterns

Design scalable, reliable, and performant systems with proven patterns.

## When to Use

- Designing new systems or features
- Evaluating architecture trade-offs
- Planning for scale
- Improving system reliability
- Making infrastructure decisions

## Core Principles

### CAP Theorem

| Property | Meaning | Trade-off |
|----------|---------|-----------|
| **C**onsistency | All nodes see the same data | Higher latency |
| **A**vailability | System responds to every request | May return stale data |
| **P**artition Tolerance | System works despite network failures | Must sacrifice C or A |

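**In practice, choose between C and A:** network partitions cannot be ruled out in a distributed system, so the real decision is what to give up while a partition lasts.
- CP: Banking, inventory (consistency critical; may reject requests during a partition)
- AP: Social media, caching (availability critical; may serve stale data during a partition)
- CA: Single-node systems only (no network partitions)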

### ACID vs BASE

| ACID (Traditional RDBMS) | BASE (Distributed) |
|--------------------------|-------------------|
| Atomicity | Basically Available |
| Consistency | Soft state |
| Isolation | Eventually consistent |
| Durability | |

## Scalability Patterns

### Horizontal vs Vertical Scaling

```
Vertical Scaling (Scale Up)          Horizontal Scaling (Scale Out)
┌─────────────────────────┐         ┌──────┐ ┌──────┐ ┌──────┐
│                         │         │      │ │      │ │      │
│     Bigger Server       │    vs   │Server│ │Server│ │Server│
│                         │         │      │ │      │ │      │
│ More CPU, RAM, Storage  │         │      │ │      │ │      │
└─────────────────────────┘         └──────┘ └──────┘ └──────┘

Pros:                               Pros:
- Simple to implement               - Near-infinite scale
- No code changes                   - Fault tolerant
- Lower operational complexity      - Cost effective at scale

Cons:                               Cons:
- Hardware limits                   - Distributed complexity
- Single point of failure           - Data consistency challenges
- Expensive at scale                - More operational overhead
```

### Load Balancing Strategies

```csharp
// Strategy selection based on use case

public enum LoadBalancingStrategy
{
    // Simple, stateless services
    RoundRobin,

    // Varying server capacities
    WeightedRoundRobin,

    // Session affinity needed
    IpHash,

    // Optimal resource utilization
    LeastConnections,

    // Latency-sensitive applications
    LeastResponseTime,

    // Geographic distribution
    GeographicBased
}
```

| Strategy | Use Case | Trade-off |
|----------|----------|-----------|
| Round Robin | Stateless, homogeneous servers | No load or health awareness |
| Weighted Round Robin | Different server sizes | Weights configured manually |
| IP Hash | Session stickiness | Uneven distribution |
| Least Connections | Long-lived connections | Connection-tracking overhead |
| Least Response Time | Latency-sensitive applications | Requires continuous latency measurement |
| Geographic | Global users | Routing and deployment complexity |
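
To make two of these strategies concrete, the following is a minimal sketch of round-robin and least-connections selection. The `Backend`, `RoundRobinSelector`, and `LeastConnectionsSelector` types are hypothetical names introduced for illustration; a production load balancer would also handle health checks and backend changes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;

// Hypothetical backend descriptor, for illustration only.
public sealed class Backend
{
    public Backend(string name) => Name = name;
    public string Name { get; }
    public int ActiveConnections; // in-flight requests, maintained by the caller
}

public sealed class RoundRobinSelector
{
    private int _counter = -1;

    // Cycles through the backends in order, ignoring their current load.
    public Backend Next(IReadOnlyList<Backend> backends)
    {
        // The uint cast keeps the index non-negative after the counter wraps.
        uint ticket = unchecked((uint)Interlocked.Increment(ref _counter));
        return backends[(int)(ticket % (uint)backends.Count)];
    }
}

public static class LeastConnectionsSelector
{
    // Picks the backend with the fewest in-flight requests (first one wins ties).
    public static Backend Pick(IReadOnlyList<Backend> backends) =>
        backends.OrderBy(b => b.ActiveConnections).First();
}
```

Round robin needs only a counter, which is why it suits homogeneous stateless services. Least connections needs an accurate per-backend count, which is the tracking overhead noted in the table.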

### Database Scaling

#### Read Replicas

```
┌─────────────────────────────────────────────────────┐
│                    Application                       │
└──────────────────────┬──────────────────────────────┘
                       │
        ┌──────────────┴──────────────┐
        │                             │
        ▼                             ▼
┌───────────────┐           ┌─────────────────┐
│  Primary DB   │──────────►│  Read Replica 1 │
│  (Writes)     │    Async  ├─────────────────┤
│               │──────────►│  Read Replica 2 │
└───────────────┘    Repl   └─────────────────┘
                                    ▲
                                    │
                              Read Queries
```

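```csharp
// Read/write splitting in ABP (sketch).
// Assumption: the read-only repository has been wired to a replica connection;
// injecting IReadOnlyRepository alone does not change which database is queried.
using System;
using System.Threading.Tasks;
using Volo.Abp.Application.Services;
using Volo.Abp.Domain.Repositories;

public class PatientAppService : ApplicationService
{
    private readonly IReadOnlyRepository<Patient, Guid> _readRepository;
    private readonly IRepository<Patient, Guid> _writeRepository;

    public PatientAppService(
        IReadOnlyRepository<Patient, Guid> readRepository,
        IRepository<Patient, Guid> writeRepository)
    {
        _readRepository = readRepository;
        _writeRepository = writeRepository;
    }

    // Reads go to replicas (results may lag the primary under async replication)
    public async Task<PatientDto> GetAsync(Guid id)
    {
        var patient = await _readRepository.GetAsync(id);
        return ObjectMapper.Map<Patient, PatientDto>(patient);
    }

    // Writes go to primary
    public async Task<PatientDto> CreateAsync(CreatePatientDto input)
    {
        var patient = new Patient(GuidGenerator.Create(), input.Name);
        await _writeRepository.InsertAsync(patient);
        return ObjectMapper.Map<Patient, PatientDto>(patient);
    }
}
```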

#### Database Sharding

```
┌─────────────────────────────────────────────────────────────┐
│                     Shard Router                            │
│         (Routes queries based on shard key)                 │
└────────────┬──────────────┬──────────────┬─────────────────┘
             │              │              │
             ▼              ▼              ▼
      ┌───────────┐  ┌───────────┐  ┌───────────┐
      │  Shard 1  │  │  Shard 2  │  │  Shard 3  │
      │  A - H    │  │  I - P    │  │  Q - Z    │
      │ (Users)   │  │ (Users)   │  │ (Users)   │
      └───────────┘  └───────────┘  └───────────┘
```

| Sharding Strategy | Pros | Cons |
|-------------------|------|------|
| Range-based | Simple, range queries work | Hotspots possible |
| Hash-based | Even distribution | Range queries need scatter-gather |
| Directory-based | Flexible | Lookup overhead, SPOF |
| Geographic | Data locality | Cross-region queries slow |
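
As an illustration of the hash-based row, here is a minimal routing sketch. `ShardRouter` is a hypothetical helper, and FNV-1a is swapped in only because it is short and deterministic; `string.GetHashCode()` is avoided because modern .NET randomizes it per process, so it cannot be used as a stable shard key hash.

```csharp
using System;
using System.Text;

// Hypothetical helper, for illustration only.
public static class ShardRouter
{
    // FNV-1a (32-bit): a simple hash that gives the same value on every machine.
    public static uint StableHash(string key)
    {
        const uint offsetBasis = 2166136261;
        const uint prime = 16777619;

        uint hash = offsetBasis;
        foreach (byte b in Encoding.UTF8.GetBytes(key))
        {
            hash ^= b;
            hash = unchecked(hash * prime); // wrap-around on overflow is intended
        }
        return hash;
    }

    // Maps a shard key to a shard index in the range [0, shardCount).
    public static int GetShardIndex(string shardKey, int shardCount)
    {
        if (shardCount <= 0)
            throw new ArgumentOutOfRangeException(nameof(shardCount));

        return (int)(StableHash(shardKey) % (uint)shardCount);
    }
}
```

Plain modulo routing remaps most keys whenever `shardCount` changes, so resharding means moving data. Consistent hashing or a directory lookup are the usual ways to reduce that cost.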

## Caching Patterns

### Cache-Aside (Lazy Loading)

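```csharp
using System;
using System.Text.Json;
using System.Threading.Tasks;
using Microsoft.Extensions.Caching.Distributed;
using Volo.Abp.ObjectMapping;

public class PatientService
{
    private readonly IDistributedCache _cache;
    private readonly IPatientRepository _repository;
    // Assumption: ABP's IObjectMapper is injected, since this class
    // does not inherit an ObjectMapper property from ApplicationService.
    private readonly IObjectMapper _objectMapper;

    public PatientService(
        IDistributedCache cache,
        IPatientRepository repository,
        IObjectMapper objectMapper)
    {
        _cache = cache;
        _repository = repository;
        _objectMapper = objectMapper;
    }

    public async Task<PatientDto> GetAsync(Guid id)
    {
        var cacheKey = $"patient:{id}";

        // 1. Check cache
        var cached = await _cache.GetStringAsync(cacheKey);
        if (cached != null)
        {
            return JsonSerializer.Deserialize<PatientDto>(cached);
        }

        // 2. Cache miss - load from DB
        var patient = await _repository.GetAsync(id);
        var dto = _objectMapper.Map<Patient, PatientDto>(patient);

        // 3. Populate cache
        await _cache.SetStringAsync(
            cacheKey,
            JsonSerializer.Serialize(dto),
            new DistributedCacheEntryOptions
            {
                AbsoluteExpirationRelativeToNow = TimeSpan.FromMinutes(10)
            });

        return dto;
    }

    public async Task UpdateAsync(Guid id, UpdatePatientDto input)
    {
        // Update database
        var patient = await _repository.GetAsync(id);
        patient.Update(input.Name, input.Email);
        await _repository.UpdateAsync(patient);

        // Invalidate cache so the next read reloads fresh data
        await _cache.RemoveAsync($"patient:{id}");
    }
}
```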

### Write-Through Cache

```csharp
public async Task<PatientDto> CreateAsync(CreatePatientDto input)
{
    // 1. Write to database
    var patient = new Patient(GuidGenerator.Create(), input.Name);
    await _repository.InsertAsync(patient);

    // 2. Write to cache synchronously
    var dto = ObjectMapper.Map<Patient, PatientDto>(patient);
    await _cache.SetStringAsync(
        $"patient:{patient.Id}",
        JsonSerializer.Serialize(dto),
        new DistributedCacheEntryOptions
        {
            AbsoluteExpirationRelativeToNow = TimeSpan.FromMinutes(10)
        });

    return dto;
}
```

### Cache Strategies Comparison

| Pattern | Consistency | Workload | Use Case |
|---------|-------------|----------|----------|
| Cache-Aside | Eventual | Read-heavy | User profiles |
| Write-Through | Strong | Write + Read | Financial data |
| Write-Behind | Eventual | Write-heavy | Analytics, logs |
| Read-Through | Eventual | Read-heavy | Reference data |

## Reliability Patterns

### Circuit Breaker

```csharp
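// Using Polly (v7-style Policy API); Log is assumed to be Serilog's static logger
using System;
using System.Net.Http;
using System.Net.Http.Json;
using System.Threading.Tasks;
using Polly;
using Polly.CircuitBreaker;
using Serilog;
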
public class ExternalServiceClient
{
    private readonly HttpClient _client;
    private readonly AsyncCircuitBreakerPolicy _circuitBreaker;

    public ExternalServiceClient(HttpClient client)
    {
        _client = client;
        _circuitBreaker = Policy
            .Handle<HttpRequestException>()
            .CircuitBreakerAsync(
                exceptionsAllowedBeforeBreaking: 5,
                durationOfBreak: TimeSpan.FromSeconds(30),
                onBreak: (ex, duration) =>
                    Log.Warning("Circuit opened for {Duration}s", duration.TotalSeconds),
                onReset: () =>
                    Log.Information("Circuit closed"),
                onHalfOpen: () =>
                    Log.Information("Circuit half-open, testing...")
            );
    }

    public async Task<T> GetAsync<T>(string endpoint)
    {
        return await _circuitBreaker.ExecuteAsync(async () =>
        {
            var response = await _client.GetAsync(endpoint);
            response.EnsureSuccessStatusCode();
            return await response.Content.ReadFromJsonAsync<T>();
        });
    }
}
```

### Retry with Exponential Backoff

```csharp
var retryPolicy = Policy
    .Handle<HttpRequestException>()
    .WaitAndRetryAsync(
        retryCount: 3,
        sleepDurationProvider: attempt =>
            TimeSpan.FromSeconds(Math.Pow(2, attempt)), // 2, 4, 8 seconds
        onRetry: (ex, delay, attempt, context) =>
            Log.Warning("Retry {Attempt} after {Delay}s: {Error}",
                attempt, delay.TotalSeconds, ex.Message)
    );
```
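
When many clients retry on the same fixed schedule, their retries can arrive in synchronized bursts. A common refinement is to cap the delay and add random jitter. The sketch below shows only the delay calculation, independent of Polly; `Backoff` and its method names are hypothetical.

```csharp
using System;

// Hypothetical helper, for illustration only.
public static class Backoff
{
    // 2^attempt seconds for a 1-based attempt (2, 4, 8, ...), capped at maxDelay.
    public static TimeSpan Exponential(int attempt, TimeSpan maxDelay)
    {
        // Compare as doubles first so large attempts cannot overflow TimeSpan.
        double seconds = Math.Min(Math.Pow(2, attempt), maxDelay.TotalSeconds);
        return TimeSpan.FromSeconds(seconds);
    }

    // "Full jitter": a random delay between zero and the capped exponential delay.
    public static TimeSpan WithFullJitter(int attempt, TimeSpan maxDelay, Random random)
    {
        TimeSpan ceiling = Exponential(attempt, maxDelay);
        return TimeSpan.FromSeconds(random.NextDouble() * ceiling.TotalSeconds);
    }
}
```

A function like `attempt => Backoff.WithFullJitter(attempt, TimeSpan.FromSeconds(30), random)` could be passed as the `sleepDurationProvider` shown above. The 30-second cap is an arbitrary example value.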

### Bulkhead Pattern

```csharp
// Isolate failures to prevent cascade
var bulkhead = Policy.BulkheadAsync(
    maxParallelization: 10,      // Max concurrent executions
    maxQueuingActions: 20,       // Max queued requests
    onBulkheadRejectedAsync: context =>
    {
        Log.Warning("Bulkhead rejected request");
        return Task.CompletedTask;
    }
);
```
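
To show what the bulkhead does underneath, here is a minimal sketch built on `SemaphoreSlim`, without Polly and without a queue. `SimpleBulkhead` is a hypothetical type; it rejects work immediately when every slot is taken, whereas the Polly policy above can also queue up to `maxQueuingActions` requests.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical type, for illustration only.
public sealed class SimpleBulkhead
{
    private readonly SemaphoreSlim _slots;

    public SimpleBulkhead(int maxParallelization)
    {
        _slots = new SemaphoreSlim(maxParallelization, maxParallelization);
    }

    // Runs the action if a slot is free and returns true.
    // Returns false without running it when the bulkhead is full.
    public async Task<bool> TryExecuteAsync(Func<Task> action)
    {
        // Wait(0) never blocks: false means every slot is currently in use.
        if (!_slots.Wait(0))
        {
            return false;
        }

        try
        {
            await action();
            return true;
        }
        finally
        {
            _slots.Release();
        }
    }
}
```

Giving each downstream dependency its own bulkhead keeps a slow dependency from consuming every thread or connection in the process.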

## Event-Driven Architecture

### Message Queue Pattern

```
┌─────────┐    ┌─────────────┐    ┌─────────────┐
│ Service │───►│   Message   │───►│  Consumer   │
│    A    │    │    Queue    │    │  Service B  │
└─────────┘    │             │    └─────────────┘
               │  (RabbitMQ, │
               │   Kafka,    │    ┌─────────────┐
               │   Azure SB) │───►│  Consumer   │
               └─────────────┘    │  Service C  │
                                  └─────────────┘
```
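
The producer/consumer decoupling in the diagram can be sketched in-process with `System.Threading.Channels`. This is a stand-in for a real broker such as RabbitMQ or Kafka, with one consumer rather than two; `QueueDemo` and the capacity of 100 are hypothetical.

```csharp
using System.Collections.Generic;
using System.Threading.Channels;
using System.Threading.Tasks;

// Hypothetical demo type, for illustration only.
public static class QueueDemo
{
    public static async Task<List<string>> RunAsync(IEnumerable<string> messages)
    {
        // Bounded queue: WriteAsync waits when the queue is full (backpressure).
        var queue = Channel.CreateBounded<string>(capacity: 100);

        // Consumer: processes messages at its own pace until the queue is completed.
        var consumer = Task.Run(async () =>
        {
            var handled = new List<string>();
            await foreach (var message in queue.Reader.ReadAllAsync())
            {
                handled.Add($"handled:{message}");
            }
            return handled;
        });

        // Producer: publishes and moves on without waiting for processing.
        foreach (var message in messages)
        {
            await queue.Writer.WriteAsync(message);
        }
        queue.Writer.Complete();

        return await consumer;
    }
}
```

The producer only depends on the queue, not on the consumer, which is the loose coupling the pattern is chosen for. A real broker adds durability and delivery guarantees that an in-memory channel does not provide.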

### Event Sourcing

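```csharp
using System;
using System.Collections.Generic;

// Store events, not state
public class PatientAggregate
{
    // Events raised since loading; these still need to be persisted
    private readonly List<IDomainEvent> _uncommittedEvents = new();

    public Guid Id { get; private set; }
    public string Name { get; private set; }
    public PatientStatus Status { get; private set; }

    public IReadOnlyList<IDomainEvent> UncommittedEvents => _uncommittedEvents;

    // Record a new event and apply it to the current state
    public void Raise(IDomainEvent @event)
    {
        Apply(@event);
        _uncommittedEvents.Add(@event);
    }

    // State transitions only. Nothing is recorded here, so replaying
    // history does not re-add old events as if they were new.
    private void Apply(IDomainEvent @event)
    {
        switch (@event)
        {
            case PatientCreated created:
                Id = created.PatientId;
                Name = created.Name;
                Status = PatientStatus.Active;
                break;
            case PatientNameChanged renamed:
                Name = renamed.NewName;
                break;
            default:
                throw new InvalidOperationException(
                    $"Unhandled event type: {@event.GetType().Name}");
        }
    }

    // Rebuild state from events
    public static PatientAggregate FromEvents(IEnumerable<IDomainEvent> events)
    {
        var patient = new PatientAggregate();
        foreach (var @event in events)
        {
            patient.Apply(@event);
        }
        return patient;
    }
}
```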

## Quick Reference: Design Trade-offs

| Decision | Option A | Option B | Consider |
|----------|----------|----------|----------|
| Storage | SQL | NoSQL | Data structure, consistency needs |
| Caching | Redis | In-memory | Distributed needs, size |
| Communication | Sync (HTTP) | Async (Queue) | Coupling, latency tolerance |
| Consistency | Strong | Eventual | Business requirements |
| Scaling | Vertical | Horizontal | Cost, complexity, limits |

## System Design Checklist

- [ ] **Requirements**: Functional + Non-functional defined
- [ ] **Scale**: Expected users, requests/sec, data volume
- [ ] **Availability**: Uptime target (99.9% = 8.76h downtime/year)
- [ ] **Latency**: P50, P95, P99 targets
- [ ] **Data**: Storage type, retention, backup strategy
- [ ] **Caching**: What to cache, invalidation strategy
- [ ] **Security**: Auth, encryption, compliance
- [ ] **Monitoring**: Metrics, logging, alerting
- [ ] **Failure modes**: What happens when X fails?
- [ ] **Cost**: Infrastructure, operational overhead
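
The downtime figure in the availability item follows from simple arithmetic: 0.1% of the 8,760 hours in a 365-day year is 8.76 hours. The sketch below performs the same conversion for any target; `Availability` is a hypothetical helper name.

```csharp
using System;

// Hypothetical helper, for illustration only.
public static class Availability
{
    // Converts an uptime target (e.g. 99.9) into allowed downtime per 365-day year.
    public static TimeSpan AllowedDowntimePerYear(double availabilityPercent)
    {
        double fractionDown = 1.0 - availabilityPercent / 100.0;
        return TimeSpan.FromHours(365 * 24 * fractionDown);
    }
}
```

Each additional nine cuts the budget by a factor of ten: 99.99% allows roughly 52.6 minutes per year.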

## Related Skills

- `technical-design-patterns` - Document designs
- `api-design-principles` - API architecture
- `distributed-events-advanced` - Event patterns
