nw-sd-patterns
Core distributed systems patterns - load balancing, caching, sharding, consistent hashing, message queues, rate limiting, CDN, Bloom filters, ID generation, replication, conflict resolution, CAP theorem
Best use case
nw-sd-patterns is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Teams using nw-sd-patterns can expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in .claude/skills/nw-sd-patterns/SKILL.md inside your project
- Restart your AI agent; it will auto-discover the skill
How nw-sd-patterns Compares
| Feature / Agent | nw-sd-patterns | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
It provides a compact reference to core distributed systems patterns: load balancing, caching, sharding, consistent hashing, message queues, rate limiting, CDN, Bloom filters, ID generation, replication, conflict resolution, and the CAP theorem.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# Core Distributed Systems Patterns

## Load Balancing

**Problem**: single server can't handle all traffic.

**Approaches**: Round Robin (simple, ignores load) | Weighted Round Robin (accounts for capacity) | Least Connections (fewest active) | IP Hash (session affinity) | Layer 4/transport (IP/port, fast) | Layer 7/application (HTTP-aware, smarter)

**Placement**: client-to-web | web-to-app | app-to-database

**Trade-offs**: LB itself is SPOF -- use active-passive pair | session affinity complicates horizontal scaling -- prefer stateless servers | health checks critical

## Caching

**Problem**: repeated DB reads are slow.

**Strategies**: Cache-aside/lazy loading (app checks cache, fills on miss -- most common) | Write-through (write cache+DB simultaneously) | Write-behind (cache only, async to DB) | Read-through (cache fronts DB transparently)

**Cache-aside pattern**: Read: `cache.get(key) -> hit? return : db.read -> cache.set -> return` | Write: `db.write -> cache.delete(key)`

**Eviction**: LRU (most common) | LFU (skewed access) | TTL (time-based)

**Problems**: thundering herd (many misses simultaneously -- use locking/coalescing) | cache penetration (non-existent keys -- Bloom filter or cache null) | cache avalanche (mass expiration -- jittered TTLs) | size cache based on working set, not total data

## Database Replication

**Master-Slave**: all writes to master, reads to replicas | replication lag = eventual consistency | master fails: promote replica

**Multi-Master**: writes to any node, conflict resolution required | better write availability, much more complex | suitable for multi-region

**Trade-offs**: sync replication = consistency but higher write latency | async = lower latency but data loss risk on failure

## Database Sharding

**Problem**: single DB can't handle write volume or data size.
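
To make the hash-based sharding strategy concrete, here is a minimal sketch of `hash(key) % N` routing, together with a measurement of how many keys change shard when N grows (the resharding pain noted for this strategy). The function names, the choice of MD5 as a stable hash, and the `user_<i>` key format are illustrative assumptions, not part of the skill.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Route a key to a shard index using hash(key) % N."""
    # Use a stable hash; Python's built-in hash() is randomized per process.
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def moved_fraction(keys, old_n: int, new_n: int) -> float:
    """Fraction of keys whose shard changes when the shard count changes."""
    moved = sum(1 for k in keys if shard_for(k, old_n) != shard_for(k, new_n))
    return moved / len(keys)


# Hypothetical key set for illustration only.
keys = [f"user_{i}" for i in range(10_000)]
fraction = moved_fraction(keys, old_n=4, new_n=5)
print(f"{fraction:.0%} of keys change shard when going from 4 to 5 shards")
```

With uniformly hashed keys, only keys whose hash leaves the same remainder modulo 4 and modulo 5 stay put, so roughly four in five keys move. That cost is what consistent hashing is meant to reduce.
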

**Strategies**: Hash-based (hash(key) % N -- even but resharding painful) | Range-based (ranges, can have hotspots) | Directory-based (lookup table, flexible but SPOF)

**Partition key**: must distribute data AND queries evenly | must be in most queries | common: user_id, tenant_id, region

**Challenges**: resharding (consistent hashing helps) | celebrity/hotspot problem | cross-shard joins (expensive -- denormalize) | referential integrity (enforce in app) | schema changes across all shards

## Consistent Hashing

**Problem**: traditional hash(key) % N remaps almost all keys when N changes.

**How**: hash output space as ring (0 to 2^32-1) | servers at positions on ring | keys walk clockwise to first server | adding/removing server affects only adjacent keys

**Virtual nodes**: each physical server gets 100-200 positions | ensures even distribution | handles heterogeneous capacities

**Used in**: DynamoDB, Cassandra, Discord, Akamai CDN

## Message Queues

**Problem**: tight coupling; spikes overwhelm downstream.

**Properties**: decoupling | buffering (absorbs spikes) | async processing | guaranteed delivery

**Patterns**: Point-to-point (one consumer per message) | Pub/Sub (all subscribers get message) | Dead letter queue (failed messages for debugging)

**When**: email/notification sending | image/video processing | analytics ingestion | cross-service communication | any op where user doesn't need immediate result

**Technologies**: Kafka (high throughput, log-based, event streaming) | RabbitMQ (flexible routing, task queues) | SQS (managed, AWS) | Redis Streams (lightweight)

## Rate Limiting

**Problem**: protect services from abuse and cascading overload.
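
As an illustration of the token bucket algorithm listed for rate limiting (tokens refill at a fixed rate, bursts allowed up to the bucket size), here is a minimal single-process sketch. The class name, parameter names, and the injectable `clock` are assumptions made for the example; a production limiter would typically keep this state per user in a shared store such as Redis.

```python
import time


class TokenBucket:
    """In-memory token bucket: refills continuously, allows bursts up to capacity."""

    def __init__(self, capacity: float, refill_rate: float, clock=time.monotonic):
        self.capacity = capacity        # maximum burst size, in tokens
        self.refill_rate = refill_rate  # tokens added per second
        self.clock = clock              # injectable so the logic can be tested
        self.tokens = capacity          # start with a full bucket
        self.last_refill = clock()

    def allow(self, cost: float = 1.0) -> bool:
        """Return True and consume tokens if the request fits, else False."""
        now = self.clock()
        elapsed = now - self.last_refill
        # Add the tokens earned since the last call, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last_refill = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False  # caller would typically respond with HTTP 429


# Usage with a fake clock: a burst of 2 is allowed, then 1 request per second.
fake_now = [0.0]
bucket = TokenBucket(capacity=2, refill_rate=1.0, clock=lambda: fake_now[0])
burst = [bucket.allow() for _ in range(3)]  # third request exceeds the burst
fake_now[0] = 1.0                           # one second later, one token is back
after_wait = bucket.allow()
```
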

| Algorithm | Mechanism | Pros | Cons |
|-----------|-----------|------|------|
| Token Bucket | tokens refill at fixed rate | allows bursts, simple | memory per user |
| Leaking Bucket | queue with fixed processing rate | smooth output | no burst flexibility |
| Fixed Window | count per time window | simple | burst at edges |
| Sliding Window Log | track each request timestamp | precise | memory-intensive |
| Sliding Window Counter | hybrid fixed + weighted | good balance | approximate |

Token Bucket is industry standard (AWS, Stripe, GitHub).

Implementation: API gateway or per-service | Redis counters with TTL | return 429 with Retry-After and X-RateLimit headers

## CDN

**Problem**: static content from origin adds latency for distant users.

**How**: assets cached at edge servers worldwide | DNS routes to nearest edge | cache miss fetches from origin

**Push vs Pull**: Push (upload to CDN, infrequent changes) | Pull (CDN fetches on first request, simpler)

**Invalidation**: URL versioning (preferred) | CDN API purge | TTL expiration

## Bloom Filters

**Problem**: quickly check "is X in set?" without storing full set.
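
A minimal sketch of a Bloom filter as described in this section: a bit array plus k hash positions per item, where insert sets k bits and query checks them. The class and method names are hypothetical, and deriving the k positions from one SHA-256 digest (double hashing) is an assumption chosen for brevity rather than something the skill prescribes.

```python
import hashlib
import math


class BloomFilter:
    """Bit array + k hash positions; false positives possible, no false negatives."""

    def __init__(self, expected_items: int, bits_per_item: int = 10):
        self.size = max(1, expected_items * bits_per_item)  # total bits (m)
        # Optimal k is (m/n) * ln 2, which is about 7 for 10 bits per item.
        self.num_hashes = max(1, round(bits_per_item * math.log(2)))
        self.bits = bytearray((self.size + 7) // 8)

    def _positions(self, item: str):
        # Derive k positions from two 64-bit halves of a single digest.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # odd, so the step is never 0
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.size

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: str) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


# Usage: guard a cache/DB lookup against keys that were never stored.
seen = BloomFilter(expected_items=1000)
for i in range(1000):
    seen.add(f"key_{i}")

false_positives = sum(seen.might_contain(f"missing_{i}") for i in range(1000))
```

Every added key reports as possibly present, while `false_positives` counts absent keys that still pass the check; at 10 bits per element it should stay near the ~1% rate quoted in this section.
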

**How**: bit array + k hash functions | insert sets k bits | query checks k bits | false positives possible, false negatives impossible

**Used for**: cache penetration prevention | duplicate URL detection (crawlers) | spam filtering

**Config**: 10 bits per element ~ 1% false positive rate | cannot delete (use Counting Bloom Filter)

## Unique ID Generation

| Approach | Sortable | Size | Coordination | Throughput |
|----------|----------|------|-------------|------------|
| UUID v4 | No | 128b | None | Unlimited |
| DB auto-inc | Yes | 64b | High | Limited |
| Ticket server | Yes | 64b | Medium | Limited |
| Snowflake | Yes | 64b | Minimal | Very high |

**Snowflake**: `[1 unused | 41 timestamp | 5 datacenter | 5 machine | 12 sequence]` -- 12-bit sequence allows 4096 IDs/ms (~4M IDs/sec) per machine | clock sync via NTP is Achilles heel

## Fan-out Strategies

**Fan-out on write (push)**: post immediately written to all followers' feeds | read is instant | expensive for celebrities

**Fan-out on read (pull)**: feed computed at read time | write is fast | read is slow

**Hybrid (production)**: push for normal users | pull for celebrities (>10K followers)

## Real-time Communication

**Long Polling**: server holds request open until data or timeout | simple, resource-intensive

**WebSocket**: full-duplex persistent | low latency | stateful (complicates LB -- need sticky sessions)

**SSE**: server pushes over HTTP | unidirectional | auto-reconnect | simpler for notification/feed

## Geohashing and Spatial Indexing

**Geohash**: encodes lat/lon to string, nearby share prefix | precision by length (4=39km, 6=1.2km, 8=38m) | boundary problem: query target + 8 neighbors

**Quadtree**: recursive subdivision into 4 quadrants | adaptive to density | in-memory, 200M items ~1.7GB

**Geohash vs Quadtree**: geohash simpler (string prefix), DB-friendly | quadtree adaptive to density, in-memory only

## Data Replication Strategies

**Single-leader**: one primary writes, replicas read | simple but SPOF

**Multi-leader**: multiple write nodes, conflict resolution | better for multi-DC

**Leaderless (Dynamo)**: any node reads/writes | quorum W+R>N | W=1,R=N fast writes | W=N,R=1 fast reads | W=R=N/2+1 balanced | anti-entropy + read repair

## Conflict Resolution

**LWW**: timestamp-based, simple but lossy | **Vector clocks**: detect conflicts, app resolves | **CRDTs**: auto-merge data types | **Application-level**: present to user (like Git)

## CAP Theorem

**CP (consistency)**: reject writes during partition | HBase, MongoDB, Redis Cluster | financial transactions

**AP (availability)**: accept writes, resolve later | Cassandra, DynamoDB, CouchDB | social feeds, shopping carts

**Real question**: "what happens during network partition?" | most systems need availability for reads, consistency for certain writes | tunable consistency (Cassandra) gives flexibility

## Write-Ahead Log (WAL)

Before applying mutation, write to append-only log | acknowledge to client | periodically apply to data structure | on crash: replay from last checkpoint. Used in PostgreSQL, MySQL, Cassandra, Kafka.

## Gossip Protocol

Each node maintains member list with heartbeat counters | periodically exchanges state with random peer | propagates in O(log N) rounds. Used for membership, failure detection, config propagation.

## Trie (Prefix Tree)

Each node = character, root-to-leaf = string | optimizations: compress single-child chains, cache top results at each node, shard by first character. Used in: search autocomplete, spell checking, IP routing.
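
The trie described above can be sketched in a few lines for the autocomplete use case. This is an unoptimized illustration: the class and method names are hypothetical, an `is_word` flag is assumed so that words may end at internal nodes, and none of the listed optimizations (chain compression, cached top results, sharding) are implemented.

```python
class TrieNode:
    def __init__(self):
        self.children = {}    # character -> TrieNode
        self.is_word = False  # True if a stored string ends at this node


class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word: str) -> None:
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True

    def autocomplete(self, prefix: str, limit: int = 5) -> list:
        """Return up to `limit` stored words starting with `prefix`, in sorted order."""
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []  # no stored word has this prefix
        results = []
        stack = [(node, prefix)]
        while stack and len(results) < limit:
            current, text = stack.pop()
            if current.is_word:
                results.append(text)
            # Push children in reverse order so the smallest character pops first.
            for ch in sorted(current.children, reverse=True):
                stack.append((current.children[ch], text + ch))
        return results


# Usage: suggestions for a search box.
trie = Trie()
for word in ["car", "card", "care", "cat", "dog"]:
    trie.insert(word)

suggestions = trie.autocomplete("ca")  # ['car', 'card', 'care', 'cat']
```

Each lookup walks the subtree under the prefix, which is why the source suggests caching top results at each node for hot prefixes.
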
Related Skills
nw-ux-web-patterns
Web UI design patterns for product owners. Load when designing web application interfaces, writing web-specific acceptance criteria, or evaluating responsive designs.
nw-ux-tui-patterns
Terminal UI and CLI design patterns for product owners. Load when designing command-line tools, interactive terminal applications, or writing CLI-specific acceptance criteria.
nw-ux-desktop-patterns
Desktop application UI patterns for product owners. Load when designing native or cross-platform desktop applications, writing desktop-specific acceptance criteria, or evaluating panel layouts and keyboard workflows.
nw-sd-patterns-advanced
Advanced distributed patterns - event sourcing, CQRS, saga, stream processing, append-only log, exactly-once delivery, sequencer, double-entry ledger, erasure coding, order book, watermarks
nw-design-patterns
7 agentic design patterns with decision tree for choosing the right pattern for each agent type
nw-data-architecture-patterns
Data architecture patterns (warehouse, lake, lakehouse, mesh), ETL/ELT pipelines, streaming architectures, scaling strategies, and schema design patterns
nw-command-design-patterns
Best practices for command definition files - size targets, declarative template, anti-patterns, and canonical examples based on research evidence
nw-architecture-patterns
Comprehensive architecture patterns, methodologies, quality frameworks, and evaluation methods for solution architects. Load when designing system architecture or selecting patterns.
nw-ux-principles
Core UX principles for product owners. Load when evaluating interface designs, writing acceptance criteria with UX requirements, or reviewing wireframes and mockups.
nw-ux-emotional-design
Emotional design and delight patterns for product owners. Load when designing onboarding flows, empty states, first-run experiences, or evaluating the emotional quality of an interface.
nw-user-story-mapping
User story mapping for backlog management and outcome-based prioritization. Load during Phase 2.5 (User Story Mapping) to produce story-map.md and prioritization.md.
nw-tr-review-criteria
Review dimensions and scoring for root cause analysis quality assessment