microservices-architecture
Microservices architecture patterns and best practices. Use when designing distributed systems, breaking down monoliths, or implementing service communication.
Best use case
microservices-architecture is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Microservices architecture patterns and best practices. Use when designing distributed systems, breaking down monoliths, or implementing service communication.
Teams using microservices-architecture should expect a more consistent output, faster repeated execution, less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in
.claude/skills/microservices-architecture/SKILL.mdinside your project - Restart your AI agent — it will auto-discover the skill
How microservices-architecture Compares
| Feature / Agent | microservices-architecture | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Microservices architecture patterns and best practices. Use when designing distributed systems, breaking down monoliths, or implementing service communication.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# Microservices Architecture
Comprehensive guide for designing and implementing microservices-based systems.
## Microservices Fundamentals
### What are Microservices?
```
MICROSERVICES = Independently deployable services
that do one thing well
Characteristics:
┌─────────────────────────────────────────┐
│ ✓ Single responsibility │
│ ✓ Own their data │
│ ✓ Independently deployable │
│ ✓ Communicate via APIs │
│ ✓ Technology agnostic │
│ ✓ Owned by small teams │
└─────────────────────────────────────────┘
```
### Monolith vs Microservices
```
MONOLITH:
┌─────────────────────────────────────────┐
│ Single Application │
│ ┌──────┬──────┬──────┬──────┐ │
│ │ Users│Orders│ Cart │Search│ │
│ └──────┴──────┴──────┴──────┘ │
│ Single Database │
└─────────────────────────────────────────┘
MICROSERVICES:
┌──────┐ ┌──────┐ ┌──────┐ ┌──────┐
│Users │ │Orders│ │ Cart │ │Search│
│ DB │ │ DB │ │ DB │ │ DB │
└──────┘ └──────┘ └──────┘ └──────┘
│ │ │ │
└─────────┴─────────┴─────────┘
API Gateway
```
### When to Use Microservices
```
USE WHEN:
✓ Large, complex domain
✓ Need independent scaling
✓ Multiple teams working in parallel
✓ Different technology needs per service
✓ Fault isolation is critical
✓ Frequent, independent deployments
DON'T USE WHEN:
✗ Small team/application
✗ Simple domain
✗ Tight latency requirements
✗ Limited DevOps maturity
✗ Unclear domain boundaries
```
---
## Service Design
### Domain-Driven Design
```
BOUNDED CONTEXTS:
┌─────────────────┐ ┌─────────────────┐
│ Orders │ │ Shipping │
│ Context │ │ Context │
│ ┌─────────────┐ │ │ ┌─────────────┐ │
│ │ Order │ │ │ │ Shipment │ │
│ │ LineItem │ │ │ │ Carrier │ │
│ │ Customer(ID)│ │ │ │ Address │ │
│ └─────────────┘ │ │ └─────────────┘ │
└─────────────────┘ └─────────────────┘
Each context has its own:
- Ubiquitous language
- Data model
- Business rules
```
### Service Boundaries
```
GOOD boundaries follow:
┌─────────────────────────────────────────┐
│ Business Capability │
│ - What the business does │
│ - e.g., Payment Processing, Inventory │
├─────────────────────────────────────────┤
│ Subdomain │
│ - Area of expertise │
│ - e.g., Pricing, Catalog, Customer │
├─────────────────────────────────────────┤
│ Single Responsibility │
│ - Does one thing well │
│ - Can explain in one sentence │
└─────────────────────────────────────────┘
BAD boundaries:
✗ Technical layers (UI service, DB service)
✗ CRUD operations (User CRUD service)
✗ Too granular (EmailSender service)
```
### Service Size Guidelines
```
Right-sized service:
- 2-pizza team can own it (5-8 people)
- Rewrite in 2-4 weeks if needed
- Clear, single business purpose
- Minimal external dependencies
- Own its data completely
Too big: Multiple teams needed, mixed concerns
Too small: Can't function independently
```
---
## Communication Patterns
### Synchronous (Request/Response)
```
REST:
┌──────┐ HTTP GET /users/123 ┌──────┐
│Client│ ─────────────────────>│Server│
│ │<───────────────────── │ │
└──────┘ { "name": "John" } └──────┘
gRPC:
┌──────┐ Binary/Protobuf ┌──────┐
│Client│ ─────────────────────>│Server│
│ │<───────────────────── │ │
└──────┘ Strongly typed └──────┘
GraphQL:
┌──────┐ POST /graphql ┌──────┐
│Client│ ─────────────────────>│Server│
│ │<───────────────────── │ │
└──────┘ Flexible queries └──────┘
```
### Asynchronous (Event-Driven)
```
MESSAGE QUEUE:
┌────────┐ ┌─────────┐ ┌────────┐
│Producer│ ───> │ Queue │ ───> │Consumer│
└────────┘ │(RabbitMQ│ └────────┘
│ SQS) │
└─────────┘
EVENT STREAMING:
┌────────┐ ┌─────────┐ ┌────────┐
│Producer│ ───> │ Topic │ ───> │Consumer│
└────────┘ │ (Kafka) │ ───> │Consumer│
└─────────┘ ───> │Consumer│
└────────┘
```
### Communication Comparison
| Pattern | Use Case | Trade-offs |
| ------------------- | -------------------------- | ---------------------- |
| **REST** | CRUD, simple queries | Simple, but chatty |
| **gRPC** | High performance, internal | Fast, but complex |
| **GraphQL** | Flexible client needs | Flexible, but overhead |
| **Message Queue** | Task processing | Decoupled, but delay |
| **Event Streaming** | Event sourcing, analytics | Scalable, but complex |
---
## Data Management
### Database per Service
```
SEPARATE DATABASES:
┌────────┐ ┌────────┐ ┌────────┐
│Users │ │Orders │ │Products│
│Service │ │Service │ │Service │
├────────┤ ├────────┤ ├────────┤
│PostgreSQL│ │MongoDB │ │MySQL │
└────────┘ └────────┘ └────────┘
Benefits:
✓ Independent scaling
✓ Technology freedom
✓ No shared schema coupling
✓ Fault isolation
Challenges:
- No joins across services
- Eventual consistency
- Data duplication
```
### Data Consistency Patterns
```
SAGA PATTERN (Choreography):
┌──────┐ ┌──────┐ ┌──────┐
│Order │─────>│Payment│─────>│Ship │
│Create│ │Process│ │Order │
└──────┘ └──────┘ └──────┘
│ │ │
└─────────────┴─────────────┘
Each service publishes events
that trigger next step
SAGA PATTERN (Orchestration):
┌────────────┐
│Orchestrator│
└────────────┘
/ │ \
↓ ↓ ↓
┌──────┐ ┌──────┐ ┌──────┐
│Order │ │Payment│ │Ship │
└──────┘ └──────┘ └──────┘
Central coordinator manages flow
```
### Event Sourcing
```
Instead of storing current state:
┌────────────────────────────┐
│ Account: $500 │
└────────────────────────────┘
Store all events:
┌────────────────────────────┐
│ 1. AccountCreated $0 │
│ 2. Deposited $1000 │
│ 3. Withdrawn $300 │
│ 4. Withdrawn $200 │
│ Current: $500 │
└────────────────────────────┘
Benefits:
✓ Complete audit trail
✓ Temporal queries
✓ Event replay
✓ Natural fit for CQRS
```
---
## API Gateway
### Gateway Pattern
```
┌─────────────┐
│ API Gateway │
└──────┬──────┘
┌──────────┬──┴──┬──────────┐
↓ ↓ ↓ ↓
┌────────┐ ┌────────┐ ┌────────┐ ┌────────┐
│Users │ │Orders │ │Products│ │Auth │
│Service │ │Service │ │Service │ │Service │
└────────┘ └────────┘ └────────┘ └────────┘
Gateway responsibilities:
- Request routing
- Authentication/Authorization
- Rate limiting
- Load balancing
- Request/Response transformation
- Caching
- Monitoring
```
### BFF (Backend for Frontend)
```
Mobile App Web App
│ │
↓ ↓
┌────────────┐ ┌────────────┐
│ Mobile BFF │ │ Web BFF │
└─────┬──────┘ └─────┬──────┘
│ │
┌──────┴────────────────┴──────┐
│ Internal APIs │
└───────────────────────────────┘
Each BFF:
- Optimized for its client
- Aggregates multiple services
- Handles client-specific logic
```
---
## Service Discovery
### Client-Side Discovery
```
┌──────┐ ┌──────────────┐
│Client│────>│Service │────> Service A (192.168.1.10)
│ │ │Registry │────> Service A (192.168.1.11)
│ │<────│(Consul, etcd)│────> Service A (192.168.1.12)
└──────┘ └──────────────┘
Client queries registry, then calls service directly.
Client handles load balancing.
```
### Server-Side Discovery
```
┌──────┐ ┌─────────────┐ ┌──────────────┐
│Client│────>│Load Balancer│────>│Service │
│ │ │ │ │Registry │
└──────┘ └─────────────┘ └──────────────┘
│
┌────────┴────────┐
↓ ↓ ↓
Service Service Service
Load balancer handles discovery and routing.
Simpler for clients.
```
---
## Resilience Patterns
### Circuit Breaker
```
States:
┌────────┐ Failures ┌────────┐
│ CLOSED │───────────────>│ OPEN │
│(normal)│ │(reject)│
└────────┘ └────────┘
↑ │
│ Timeout │
│ ┌────────────┐ │
└───│HALF-OPEN │<────────┘
│(test) │
└────────────┘
Implementation:
- Track failure count
- Open circuit after threshold
- Reject calls while open
- Periodically test with half-open
- Close circuit on success
```
### Retry Pattern
```typescript
async function withRetry<T>(
fn: () => Promise<T>,
maxAttempts: number = 3,
backoff: number = 1000,
): Promise<T> {
for (let attempt = 1; attempt <= maxAttempts; attempt++) {
try {
return await fn();
} catch (error) {
if (attempt === maxAttempts) throw error;
// Exponential backoff with jitter
const delay = backoff * Math.pow(2, attempt - 1);
const jitter = delay * 0.1 * Math.random();
await sleep(delay + jitter);
}
}
}
```
### Bulkhead Pattern
```
Isolate resources to prevent cascade:
┌─────────────────────────────────────┐
│ Service A │
│ ┌─────────┐ ┌─────────┐ │
│ │Thread │ │Thread │ │
│ │Pool 1 │ │Pool 2 │ │
│ │(Service │ │(Service │ │
│ │ B) │ │ C) │ │
│ └─────────┘ └─────────┘ │
└─────────────────────────────────────┘
If Service C is slow, only Pool 2 is affected.
Service B calls continue normally.
```
### Timeout Pattern
```typescript
async function withTimeout<T>(fn: () => Promise<T>, ms: number): Promise<T> {
return Promise.race([
fn(),
new Promise<T>((_, reject) =>
setTimeout(() => reject(new Error("Timeout")), ms),
),
]);
}
// Always set timeouts on external calls
const user = await withTimeout(
() => userService.getUser(id),
5000, // 5 second timeout
);
```
---
## Observability
### The Three Pillars
```
LOGS:
┌─────────────────────────────────────────┐
│ 2024-01-15 10:30:45 [INFO] OrderService │
│ Order created: { id: 123, user: 456 } │
└─────────────────────────────────────────┘
METRICS:
┌─────────────────────────────────────────┐
│ order_created_total: 1523 │
│ order_processing_seconds: 0.234 │
│ active_connections: 45 │
└─────────────────────────────────────────┘
TRACES:
┌─────────────────────────────────────────┐
│ [Request ID: abc123] │
│ ├─ Gateway: 2ms │
│ ├─ OrderService: 150ms │
│ │ ├─ UserService: 45ms │
│ │ └─ InventoryService: 80ms │
│ └─ Total: 152ms │
└─────────────────────────────────────────┘
```
### Distributed Tracing
```
Trace Context Propagation:
┌──────┐ X-Trace-ID: abc ┌──────┐
│API │─────────────────>│Order │
│GW │ │Svc │
└──────┘ └──────┘
│
X-Trace-ID: abc │
↓
┌──────┐
│User │
│Svc │
└──────┘
Tools: Jaeger, Zipkin, AWS X-Ray, Datadog
```
### Health Checks
```typescript
// Liveness: Is the service running?
app.get("/health/live", (req, res) => {
res.status(200).json({ status: "alive" });
});
// Readiness: Is the service ready to handle traffic?
app.get("/health/ready", async (req, res) => {
const dbHealthy = await checkDatabase();
const cacheHealthy = await checkCache();
if (dbHealthy && cacheHealthy) {
res.status(200).json({ status: "ready" });
} else {
res.status(503).json({
status: "not ready",
checks: { database: dbHealthy, cache: cacheHealthy },
});
}
});
```
---
## Deployment
### Containerization
```dockerfile
# Dockerfile
FROM node:18-alpine
WORKDIR /app
COPY package*.json ./
RUN npm ci --only=production
COPY dist ./dist
USER node
EXPOSE 3000
CMD ["node", "dist/main.js"]
```
### Kubernetes Basics
```yaml
# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: order-service
spec:
replicas: 3
selector:
matchLabels:
app: order-service
template:
metadata:
labels:
app: order-service
spec:
containers:
- name: order-service
image: order-service:1.0.0
ports:
- containerPort: 3000
resources:
limits:
cpu: "500m"
memory: "256Mi"
livenessProbe:
httpGet:
path: /health/live
port: 3000
readinessProbe:
httpGet:
path: /health/ready
port: 3000
```
---
## Testing Strategies
### Test Pyramid for Microservices
```
/\
/ \ E2E Tests
/────\ (Few, slow, brittle)
/ \
/────────\ Contract Tests
/ \ (Service boundaries)
/────────────\
/ \ Integration Tests
/────────────────\ (With dependencies)
/ \
/────────────────────\ Unit Tests
/ \ (Many, fast, isolated)
/________________________\
```
### Contract Testing
```typescript
// Consumer test (Order Service)
describe("User Service Contract", () => {
it("returns user by ID", async () => {
// Define expected interaction
await provider.addInteraction({
state: "user 123 exists",
uponReceiving: "a request for user 123",
withRequest: {
method: "GET",
path: "/users/123",
},
willRespondWith: {
status: 200,
body: {
id: "123",
name: like("John"),
email: like("john@example.com"),
},
},
});
// Test passes if consumer expectations match
});
});
```
---
## Best Practices
### DO:
- Start with a monolith, extract services later
- Define clear service boundaries
- Use asynchronous communication where possible
- Implement circuit breakers
- Centralize logging and monitoring
- Automate everything
- Design for failure
- Version your APIs
### DON'T:
- Create too many, too small services
- Share databases between services
- Make synchronous chains too deep
- Ignore distributed system complexities
- Couple services through shared libraries
- Skip contract testing
- Deploy without monitoring
---
## Migration Checklist
### Monolith to Microservices
- [ ] Map domain boundaries clearly
- [ ] Identify candidate services (start small)
- [ ] Establish CI/CD for new services
- [ ] Implement API gateway
- [ ] Set up service discovery
- [ ] Add distributed tracing
- [ ] Implement circuit breakers
- [ ] Extract first service (strangler fig pattern)
- [ ] Test extensively
- [ ] Monitor and iterateRelated Skills
microservices-patterns
Design microservices architectures with service boundaries, event-driven communication, and resilience patterns. Use when building distributed systems, decomposing monoliths, or implementing micros...
architecture-patterns
Padrões de arquitetura de software - Decisões OBJETIVAS sobre design de sistemas
rails-architecture
Guides modern Rails 8 code architecture decisions and patterns. Use when deciding where to put code, choosing between patterns (service objects vs concerns vs query objects), designing feature architecture, refactoring for better organization, or when user mentions architecture, code organization, design patterns, or layered design.
mvvm-architecture
Expert MVVM decisions for iOS/tvOS: choosing between ViewModel patterns (state enum vs published properties vs Combine), service layer boundaries, dependency injection strategies, and testing approaches. Use when designing ViewModel architecture, debugging data flow issues, or deciding where business logic belongs. Trigger keywords: MVVM, ViewModel, ObservableObject, @StateObject, service layer, dependency injection, unit test, mock, architecture
MCP Architecture Expert
Design and implement Model Context Protocol servers for standardized AI-to-data integration with resources, tools, prompts, and security best practices
architecture-paradigm-pipeline
Consult this skill when designing data pipelines or transformation workflows. Use when data flows through fixed sequence of transformations, stages can be independently developed and tested, parallel processing of stages is beneficial. Do not use when selecting from multiple paradigms - use architecture-paradigms first. DO NOT use when: data flow is not sequential or predictable. DO NOT use when: complex branching/merging logic dominates.
architecture-advisor
Helps solo developers with AI agents choose optimal architecture (monolithic/microservices/hybrid)
agent-native-architecture
Build applications where agents are first-class citizens. Use this skill when designing autonomous agents, creating MCP tools, implementing self-modifying systems, or building apps where features are outcomes achieved by agents operating in a loop.
agent-architecture
Use when designing or implementing AI agent systems. Covers tool-using agents with mandatory guardrails, SSE streaming (FastAPI → Next.js via Vercel AI SDK v6), LangGraph stateful multi-agent graphs, episodic memory via pgvector, MCP overview, and production failure modes with anti-pattern/fix code pairs.
u07820-attention-management-architecture-for-personal-finance-management
Build and operate the "Attention Management Architecture for personal finance management" capability for personal finance management. Use when this exact capability is required by autonomous or human-guided missions.
Microservices Communication
Thiết kế kiến trúc giao tiếp Microservices (gRPC, message queues, event-driven pattern).
MCP Server Architecture
This skill should be used when the user asks to "create an MCP server", "set up MCP server", "build ChatGPT app backend", "MCP transport type", "configure MCP endpoint", "server setup for Apps SDK", or needs guidance on MCP server architecture, transport protocols, or SDK setup for the OpenAI Apps SDK.