managing-database-replication
Use this skill when you need to work with database scalability. It provides guidance and automation for replication and sharding. Trigger it with phrases like "set up replication", "implement sharding", or "scale database".
Best use case
managing-database-replication is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Teams using managing-database-replication should expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in `.claude/skills/managing-database-replication/SKILL.md` inside your project
- Restart your AI agent; it will auto-discover the skill
How managing-database-replication Compares
| Feature / Agent | managing-database-replication | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
It provides guidance and automation for database replication and sharding, for use when you need to work with database scalability. Trigger it with phrases like "set up replication", "implement sharding", or "scale database".
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# Database Replication Manager
## Overview
Configure and manage database replication topologies for PostgreSQL (streaming replication, logical replication), MySQL (source-replica, group replication), and MongoDB (replica sets). This skill covers primary-replica setup, read scaling through replica routing, failover automation, replication lag monitoring, and conflict resolution for multi-primary configurations.
## Prerequisites
- Superuser or replication-role credentials on primary and replica servers
- Network connectivity between all replication nodes (verify with `pg_isready` or `mysqladmin ping`)
- `psql`, `mysql`, or `mongosh` CLI tools installed on all nodes
- Matching major database versions across all replication nodes (required for PostgreSQL streaming replication; logical replication can span major versions)
- Sufficient disk space on replicas (equal to or greater than primary)
- SSH access to replica servers for initial base backup transfer
## Instructions
1. Choose the replication topology based on requirements:
- **Single primary + read replicas**: Best for read-heavy workloads. All writes go to primary; reads distributed across replicas.
- **Multi-primary (active-active)**: Best for geographic distribution. Requires conflict resolution. Use PostgreSQL logical replication or MySQL Group Replication.
- **Cascading replication**: Replica A replicates from primary, Replica B replicates from Replica A. Reduces primary load for many replicas.
2. For PostgreSQL streaming replication, configure the primary:
- Set `wal_level = replica`, `max_wal_senders = 10`, `max_replication_slots = 10`
- Create replication user: `CREATE ROLE replicator WITH REPLICATION LOGIN PASSWORD 'secure_password'`
- Add replication entry to `pg_hba.conf`: `host replication replicator replica_ip/32 scram-sha-256`
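- Apply the changes: `wal_level`, `max_wal_senders`, and `max_replication_slots` take effect only after a server restart; a reload with `SELECT pg_reload_conf()` is enough for `pg_hba.conf` changes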
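The primary-side settings above can be generated rather than typed by hand. The following is a minimal sketch, not part of the skill itself: the helper names `render_postgresql_conf` and `render_hba_entry` and the address `10.0.0.12` are hypothetical, and the setting values are simply the ones listed in this step.

```python
import ipaddress

# Values copied from the step above; treat them as starting points, not tuned recommendations.
PRIMARY_SETTINGS = {
    "wal_level": "replica",
    "max_wal_senders": 10,
    "max_replication_slots": 10,
}


def render_postgresql_conf(settings: dict) -> str:
    """Render `key = value` lines that could be appended to postgresql.conf."""
    return "\n".join(f"{key} = {value}" for key, value in settings.items())


def render_hba_entry(replica_ip: str, role: str = "replicator") -> str:
    """Build a pg_hba.conf replication entry for one replica host.

    ipaddress.ip_address raises ValueError on malformed input, so a typo
    is caught here instead of in the server log.
    """
    address = ipaddress.ip_address(replica_ip)
    prefix = 32 if address.version == 4 else 128  # single-host mask
    return f"host replication {role} {address}/{prefix} scram-sha-256"


if __name__ == "__main__":
    print(render_postgresql_conf(PRIMARY_SETTINGS))
    print(render_hba_entry("10.0.0.12"))  # hypothetical replica address
```

Validating the address before writing the entry is a design choice for this sketch; the generated lines still have to be placed in the real configuration files by the operator.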
3. Initialize the replica with a base backup: `pg_basebackup -h primary_host -U replicator -D /var/lib/postgresql/data -Fp -Xs -P -R`. The `-R` flag creates `standby.signal` and configures `primary_conninfo` automatically.
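A small wrapper can make the base backup step repeatable. This sketch only assembles the same `pg_basebackup` arguments shown above and checks that the target directory is missing or empty, which `pg_basebackup` requires; the function names are hypothetical and nothing is executed against a server.

```python
import shlex
from pathlib import Path


def data_dir_is_usable(data_dir: str) -> bool:
    """pg_basebackup needs a target directory that is absent or empty."""
    path = Path(data_dir)
    return not path.exists() or (path.is_dir() and not any(path.iterdir()))


def basebackup_command(primary_host: str, data_dir: str, user: str = "replicator") -> list:
    """Return the argument list for the base backup described in this step."""
    return [
        "pg_basebackup",
        "-h", primary_host,
        "-U", user,
        "-D", data_dir,
        "-Fp",  # plain file layout
        "-Xs",  # stream WAL while the backup runs
        "-P",   # show progress
        "-R",   # write standby.signal and primary_conninfo
    ]


if __name__ == "__main__":
    target = "/var/lib/postgresql/data"
    if not data_dir_is_usable(target):
        raise SystemExit(f"{target} is not empty; move it aside before the backup")
    # Print instead of running, so the command can be reviewed first.
    print(shlex.join(basebackup_command("primary_host", target)))
```

Passing the list to `subprocess.run(..., check=True)` would run it; the sketch prints the command so an operator can inspect it first.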
4. For MySQL source-replica replication, configure the source:
- Set `server-id = 1`, `log_bin = mysql-bin`, `binlog_format = ROW`
- Create replication user: `CREATE USER 'replicator'@'replica_ip' IDENTIFIED BY 'secure_password'; GRANT REPLICATION SLAVE ON *.* TO 'replicator'@'replica_ip'`
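- Record the binary log file name and position: `SHOW MASTER STATUS` (in MySQL 8.4 and later this statement was removed; use `SHOW BINARY LOG STATUS`)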
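5. Configure the MySQL replica: give it a `server-id` different from the source (for example `server-id = 2`), then run `CHANGE REPLICATION SOURCE TO SOURCE_HOST='primary_host', SOURCE_USER='replicator', SOURCE_PASSWORD='...', SOURCE_LOG_FILE='mysql-bin.000001', SOURCE_LOG_POS=154; START REPLICA`. Replace the file name and position with the values recorded in step 4. `CHANGE REPLICATION SOURCE TO` requires MySQL 8.0.23 or later; older versions use `CHANGE MASTER TO`.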
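Copying the binary log coordinates by hand is error-prone, so the statement can be assembled from the recorded values. This is a minimal sketch under stated assumptions: `sql_quote` and `change_source_statement` are hypothetical helpers, the quoting assumes the server's default SQL mode (backslash escapes enabled), and the coordinates in the usage example are placeholders.

```python
import re


def sql_quote(value: str) -> str:
    """Quote a string literal for MySQL, assuming default backslash-escape mode."""
    escaped = value.replace("\\", "\\\\").replace("'", "''")
    return f"'{escaped}'"


def change_source_statement(host: str, user: str, password: str,
                            log_file: str, log_pos: int) -> str:
    """Build the replica-side statement from the coordinates recorded on the source."""
    if not re.fullmatch(r"[\w.-]+\.\d+", log_file):
        raise ValueError(f"unexpected binary log file name: {log_file!r}")
    if log_pos <= 0:
        raise ValueError("binary log position must be a positive integer")
    return (
        "CHANGE REPLICATION SOURCE TO "
        f"SOURCE_HOST={sql_quote(host)}, "
        f"SOURCE_USER={sql_quote(user)}, "
        f"SOURCE_PASSWORD={sql_quote(password)}, "
        f"SOURCE_LOG_FILE={sql_quote(log_file)}, "
        f"SOURCE_LOG_POS={int(log_pos)}"
    )


if __name__ == "__main__":
    # Placeholder values; use the file and position reported by the source.
    print(change_source_statement("primary_host", "replicator", "...",
                                  "mysql-bin.000001", 154) + ";")
    print("START REPLICA;")
```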
6. For MongoDB replica sets: start every `mongod` with the same replica set name (for example `--replSet rs0`), then initialize the set from one member with `rs.initiate({_id: "rs0", members: [{_id: 0, host: "node1:27017"}, {_id: 1, host: "node2:27017"}, {_id: 2, host: "node3:27017"}]})`. MongoDB handles primary election and failover automatically.
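The `rs.initiate` document above follows a regular shape, and the member count determines how many failures the set can survive. The sketch below builds the same document from a host list and computes the election majority; the helper names are hypothetical, and the arithmetic assumes every member is a voting member.

```python
import json


def replica_set_config(name: str, hosts: list) -> dict:
    """Build the document passed to rs.initiate() for the given hosts."""
    if len(set(hosts)) != len(hosts):
        raise ValueError("each member host must be unique")
    return {
        "_id": name,
        "members": [{"_id": index, "host": host} for index, host in enumerate(hosts)],
    }


def votes_needed(member_count: int) -> int:
    """A primary is elected only with a strict majority of voting members."""
    return member_count // 2 + 1


def tolerated_failures(member_count: int) -> int:
    """How many voting members can be lost while a majority remains."""
    return member_count - votes_needed(member_count)


if __name__ == "__main__":
    config = replica_set_config("rs0", ["node1:27017", "node2:27017", "node3:27017"])
    # JSON is valid mongosh input, so the output can be pasted as-is.
    print(f"rs.initiate({json.dumps(config)})")
    print("failures tolerated:", tolerated_failures(len(config["members"])))
```

Three and four members both tolerate a single failure under this arithmetic, which is why odd member counts are the usual choice.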
7. Monitor replication lag continuously:
- PostgreSQL: `SELECT client_addr, state, sent_lsn, write_lsn, flush_lsn, replay_lsn, (sent_lsn - replay_lsn) AS lag_bytes FROM pg_stat_replication`
- MySQL: `SHOW REPLICA STATUS\G` (check `Seconds_Behind_Source`)
- MongoDB: `rs.printReplicationInfo()` and `rs.printSecondaryReplicationInfo()`
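When the LSN columns from `pg_stat_replication` are exported to an external monitoring script as text, the byte lag can be recomputed client-side. A PostgreSQL LSN is printed as two hexadecimal numbers separated by a slash, the high and low 32 bits of a 64-bit WAL position. The sketch below assumes that textual format; `parse_lsn` and `lag_bytes` are hypothetical helper names.

```python
def parse_lsn(lsn: str) -> int:
    """Convert a textual LSN such as '0/3000060' to an absolute byte position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def lag_bytes(sent_lsn: str, replay_lsn: str) -> int:
    """Bytes of WAL sent to the replica but not yet replayed there."""
    return parse_lsn(sent_lsn) - parse_lsn(replay_lsn)


if __name__ == "__main__":
    # Sample values for illustration, not output from a real server.
    print(lag_bytes("0/3000160", "0/3000060"))
```

The result matches what `sent_lsn - replay_lsn` returns inside the database, so either side can own the calculation.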
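8. Configure application-level read routing: direct write queries to the primary connection and read queries to a load-balanced replica pool. ProxySQL query rules or application middleware can split reads from writes automatically. PgBouncer pools connections but does not inspect queries, so with PgBouncer the application must connect to separate primary and replica endpoints.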
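Read routing in application middleware can be sketched as a small classifier plus a round-robin selector. This is an illustrative sketch, not a production router: the `ReadWriteRouter` class and the endpoint names are hypothetical, and the prefix check is deliberately conservative, so anything it does not recognize goes to the primary.

```python
import itertools

# Statements that are safe to send to a replica in this sketch.
READ_PREFIXES = ("select", "show")


def is_read_only(sql: str) -> bool:
    """Conservative check: unknown statements are treated as writes."""
    statement = sql.lstrip().lower()
    if not statement.startswith(READ_PREFIXES):
        return False
    # Row-locking reads must run on the primary.
    return "for update" not in statement and "for share" not in statement


class ReadWriteRouter:
    """Send writes to the primary and rotate reads across replicas."""

    def __init__(self, primary: str, replicas: list):
        self.primary = primary
        self._replicas = itertools.cycle(replicas) if replicas else None

    def route(self, sql: str, in_transaction: bool = False) -> str:
        # Keep whole transactions on the primary so reads see their own writes.
        if in_transaction or self._replicas is None or not is_read_only(sql):
            return self.primary
        return next(self._replicas)


if __name__ == "__main__":
    router = ReadWriteRouter("primary:5432", ["replica1:5432", "replica2:5432"])
    for query in ("SELECT 1", "SELECT 2", "UPDATE t SET x = 1", "SELECT 3"):
        print(query, "->", router.route(query))
```

A `SELECT` that calls a function with side effects would still be misrouted by this check, and a read issued right after a write may hit a lagging replica; both cases need handling in a real deployment.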
9. Set up automated failover using Patroni (PostgreSQL), MySQL InnoDB Cluster + MySQL Router, or MongoDB's built-in election mechanism. Test failover by stopping the primary and verifying the replica promotes automatically within the target RTO.
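A failover test is easier to compare against the target RTO when the outage is timed by a script. The sketch below polls a caller-supplied probe until writes succeed again; the `is_writable` probe is an assumption left to the reader (for example, a function that attempts a small write through the application endpoint), and the clock and sleep functions are injectable so the loop can be exercised without a database.

```python
import time
from typing import Callable, Optional


def measure_failover_seconds(
    is_writable: Callable[[], bool],
    timeout: float,
    interval: float = 0.5,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[float]:
    """Return seconds until the probe succeeds, or None if the timeout passes.

    Call this right after stopping the primary; the result approximates the
    observed recovery time to within one polling interval.
    """
    start = clock()
    while clock() - start < timeout:
        if is_writable():
            return clock() - start
        sleep(interval)
    return None


if __name__ == "__main__":
    # Stand-in probe that succeeds on the third attempt.
    attempts = iter([False, False, True])
    elapsed = measure_failover_seconds(lambda: next(attempts), timeout=30, interval=0.1)
    print("writes not restored within timeout" if elapsed is None
          else f"writes restored after {elapsed:.1f}s")
```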
10. Configure alerting for replication lag exceeding 10 seconds, replication slot inactive for more than 1 hour, and replica connection drops. Stale replication slots in PostgreSQL cause WAL accumulation and can fill the disk.
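The three alert conditions in this step reduce to simple threshold checks once the metrics are collected. The sketch below shows one way to express them; the `ReplicaSample` structure and its field names are hypothetical, and how the values are gathered (for example from the monitoring queries in step 7) is left out.

```python
from dataclasses import dataclass
from typing import Optional

# Thresholds taken from the step above.
LAG_THRESHOLD_SECONDS = 10
SLOT_INACTIVE_THRESHOLD_SECONDS = 3600


@dataclass
class ReplicaSample:
    name: str
    connected: bool
    lag_seconds: Optional[float]  # None when lag cannot be measured
    slot_inactive_seconds: float = 0.0


def evaluate(sample: ReplicaSample) -> list:
    """Return one alert message per violated condition."""
    alerts = []
    if not sample.connected:
        alerts.append(f"{sample.name}: replica connection dropped")
    if sample.lag_seconds is not None and sample.lag_seconds > LAG_THRESHOLD_SECONDS:
        alerts.append(f"{sample.name}: replication lag {sample.lag_seconds:.0f}s")
    if sample.slot_inactive_seconds > SLOT_INACTIVE_THRESHOLD_SECONDS:
        alerts.append(f"{sample.name}: replication slot inactive over 1 hour")
    return alerts


if __name__ == "__main__":
    sample = ReplicaSample("replica1", connected=True, lag_seconds=42.0)
    for message in evaluate(sample):
        print(message)
```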
## Output
- **Replication configuration files** for primary and replica nodes
- **Base backup and initialization scripts** for setting up new replicas
- **Replication monitoring queries** with lag measurement and health checks
- **Failover runbook** with manual and automated promotion procedures
- **Read routing configuration** for application or connection pooler
## Error Handling
| Error | Cause | Solution |
|-------|-------|---------|
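| Replication lag increasing steadily | Replica cannot keep up with primary write volume | Check replica I/O and CPU; on MySQL, increase `replica_parallel_workers`; on PostgreSQL, WAL replay is a single process, so check replica disk throughput and long-running replica queries that delay replay; consider upgrading replica hardware; reduce write-heavy batch operations on primary |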
| `replication slot is inactive` warning | Replica disconnected or paused, WAL accumulating on primary | Reconnect replica; if permanently removed, drop the slot with `SELECT pg_drop_replication_slot('slot_name')` to prevent disk fill |
| `could not connect to primary` on replica | Network partition, primary down, or authentication failure | Verify network connectivity; check `pg_hba.conf` entries; confirm replication user credentials; check primary process status |
| Replica has diverged from primary | Split-brain after failed failover or manual writes to replica | Re-initialize replica from fresh base backup; for PostgreSQL, use `pg_rewind` if timeline divergence is small (requires `wal_log_hints = on` or data checksums on the diverged node) |
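| Conflict in logical replication | Same row modified on both publisher and subscriber | PostgreSQL's built-in logical replication does not resolve conflicts automatically: apply stops until the conflicting row is fixed on the subscriber or the transaction is skipped (`ALTER SUBSCRIPTION ... SKIP`, PostgreSQL 15+); design schema to avoid cross-node writes to same rows |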
## Examples
**Setting up PostgreSQL read replicas for a web application**: A primary database handles 2,000 writes/second but read traffic is 10x higher. Two streaming replicas are added, and the application sends read queries to a separate PgBouncer endpoint that spreads connections across the replicas in round-robin. Result: primary CPU drops from 90% to 40%, read latency improves by 60%.
**Automated failover with Patroni**: A 3-node PostgreSQL cluster managed by Patroni with etcd for consensus. When the primary fails, Patroni promotes the most up-to-date replica within 10 seconds. Application reconnects automatically through the Patroni-managed VIP or DNS endpoint.
**Cross-region logical replication for compliance**: EU customer data must stay in EU region. Logical replication publishes only non-PII tables to the US region replica. EU application reads locally; US analytics queries use the replicated subset. Publication filter: `CREATE PUBLICATION us_analytics FOR TABLE orders, products, categories`.
## Resources
- PostgreSQL streaming replication: https://www.postgresql.org/docs/current/warm-standby.html
- PostgreSQL logical replication: https://www.postgresql.org/docs/current/logical-replication.html
- MySQL replication: https://dev.mysql.com/doc/refman/8.0/en/replication.html
- Patroni (PostgreSQL HA): https://patroni.readthedocs.io/
- MongoDB replica sets: https://www.mongodb.com/docs/manual/replication/
Related Skills
validating-database-integrity
Use when you need to ensure database integrity through comprehensive data validation. This skill validates data types, ranges, formats, referential integrity, and business rules. Trigger with phrases like "validate database data", "implement data validation rules", "enforce data integrity constraints", or "validate data formats".
managing-test-environments
This skill enables Claude to manage isolated test environments using Docker Compose, Testcontainers, and environment variables. It is used to create consistent, reproducible testing environments for software projects. Claude should use this skill when the user needs to set up a test environment with specific configurations, manage Docker Compose files for test infrastructure, set up programmatic container management with Testcontainers, manage environment variables for tests, or ensure cleanup after tests. Trigger terms include "test environment", "docker compose", "testcontainers", "environment variables", "isolated environment", "env-setup", and "test setup".
managing-autonomous-development
Enables Claude to manage Sugar's autonomous development workflows. It allows Claude to create tasks, view the status of the system, review pending tasks, and start autonomous execution mode. Use this skill when the user asks to create a new development task using `/sugar-task`, check the system status with `/sugar-status`, review pending tasks via `/sugar-review`, or initiate autonomous development using `/sugar-run`. It provides a comprehensive interface for interacting with the Sugar autonomous development system.
managing-ssltls-certificates
This skill enables Claude to manage and monitor SSL/TLS certificates using the ssl-certificate-manager plugin. It is activated when the user requests actions related to SSL certificates, such as checking certificate expiry, renewing certificates, or listing installed certificates. Use this skill when the user mentions "SSL certificate", "TLS certificate", "certificate expiry", "renew certificate", or similar phrases related to SSL/TLS certificate management. The plugin can list, check, and renew certificates, providing vital information for maintaining secure connections.
managing-snapshot-tests
This skill enables Claude to manage and update snapshot tests using intelligent diff analysis and selective updates. It is triggered when the user asks to analyze snapshot failures, update snapshots, or manage snapshot tests in general. It helps distinguish intentional changes from regressions, selectively update snapshots, and validate snapshot integrity. Use this when the user mentions "snapshot tests", "update snapshots", "snapshot failures", or requests to run "/snapshot-manager" or "/sm". It supports Jest, Vitest, Playwright, and Storybook frameworks.
scanning-database-security
Use when you need to work with security and compliance. This skill provides security scanning and vulnerability detection with comprehensive guidance and automation. Trigger with phrases like "scan for vulnerabilities", "implement security controls", or "audit security".
optimizing-database-connection-pooling
Use when you need to work with connection management. This skill provides connection pooling and management with comprehensive guidance and automation. Trigger with phrases like "manage connections", "configure pooling", or "optimize connection usage".
managing-network-policies
This skill enables Claude to manage Kubernetes network policies and firewall rules. It allows Claude to generate configurations and setup code based on specific requirements and infrastructure. Use this skill when the user requests to create, modify, or analyze network policies for Kubernetes, or when the user mentions "network-policy", "firewall rules", or "Kubernetes security". This skill is useful for implementing best practices and production-ready configurations for network security in a Kubernetes environment.
monitoring-database-transactions
Use when you need to work with monitoring and observability. This skill provides health monitoring and alerting with comprehensive guidance and automation. Trigger with phrases like "monitor system health", "set up alerts", or "track metrics".
monitoring-database-health
Use when you need to work with monitoring and observability. This skill provides health monitoring and alerting with comprehensive guidance and automation. Trigger with phrases like "monitor system health", "set up alerts", or "track metrics".
managing-environment-configurations
Implement environment and configuration management with comprehensive guidance and automation. Use when you need to work with environment configuration. Trigger with phrases like "manage environments", "configure environments", or "sync configurations".
managing-deployment-rollbacks
Use when you need to work with deployment and CI/CD. This skill provides deployment automation and orchestration with comprehensive guidance and automation. Trigger with phrases like "deploy application", "create pipeline", or "automate deployment".