Database Sync

Automate database synchronization, replication, migration, and cross-platform data integration

16 stars

bydiegosouzapw

View on GitHub Installation ↓

Best use case

Database Sync is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Automate database synchronization, replication, migration, and cross-platform data integration

Teams using Database Sync should expect a more consistent output, faster repeated execution, less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$curl -o ~/.claude/skills/database-sync/SKILL.md --create-dirs "https://raw.githubusercontent.com/diegosouzapw/awesome-omni-skill/main/skills/data-ai/database-sync/SKILL.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/database-sync/SKILL.md inside your project
Restart your AI agent — it will auto-discover the skill

How Database Sync Compares

Feature / Agent	Database Sync	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

What does this skill do?

Automate database synchronization, replication, migration, and cross-platform data integration

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# Database Sync

Comprehensive skill for database synchronization, replication, and data integration.

## Core Architecture

### Sync Patterns

```
DATABASE SYNC PATTERNS:
┌─────────────────────────────────────────────────────────┐
│                 ONE-WAY REPLICATION                      │
│  ┌──────────┐         ┌──────────┐                      │
│  │  Master  │ ──────▶ │  Replica │                      │
│  └──────────┘         └──────────┘                      │
└─────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────┐
│                 BI-DIRECTIONAL SYNC                      │
│  ┌──────────┐         ┌──────────┐                      │
│  │ Database │ ◀─────▶ │ Database │                      │
│  │    A     │         │    B     │                      │
│  └──────────┘         └──────────┘                      │
└─────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────┐
│                  HUB-AND-SPOKE                           │
│           ┌──────────┐                                   │
│           │  Spoke 1 │                                   │
│           └────┬─────┘                                   │
│                │                                         │
│  ┌──────────┐──┴──┌──────────┐                          │
│  │  Spoke 2 │◀───▶│   Hub    │◀────┬──────────┐        │
│  └──────────┘     └──────────┘     │  Spoke 3 │        │
│                                     └──────────┘        │
└─────────────────────────────────────────────────────────┘
```

### Sync Methods

```yaml
sync_methods:
  full_sync:
    description: "Complete data refresh"
    use_when:
      - Initial sync
      - Schema changes
      - Disaster recovery
    considerations:
      - Downtime required
      - Resource intensive
      
  incremental_sync:
    description: "Changes only"
    tracking_methods:
      - timestamps (updated_at)
      - change_data_capture (CDC)
      - triggers
      - log_based
    advantages:
      - Minimal data transfer
      - Near real-time
      
  snapshot_sync:
    description: "Point-in-time copy"
    use_when:
      - Analytics
      - Reporting
      - Backup
```

## Configuration

### Source/Target Setup

```yaml
sync_config:
  source:
    type: postgresql
    host: "source-db.example.com"
    port: 5432
    database: "production"
    credentials:
      type: secret_manager
      path: "db/source/credentials"
    ssl: required
    
  target:
    type: mysql
    host: "target-db.example.com"
    port: 3306
    database: "analytics"
    credentials:
      type: secret_manager
      path: "db/target/credentials"
    ssl: required
    
  sync_settings:
    mode: incremental
    batch_size: 10000
    parallel_tables: 4
    retry_attempts: 3
    checkpoint_interval: 5_minutes
```

### Table Mapping

```yaml
table_mappings:
  - source_table: users
    target_table: dim_users
    columns:
      id: user_id
      email: email_address
      created_at: registration_date
      status: user_status
    transformations:
      - column: status
        transform: "UPPER(status)"
      - column: email_address
        transform: "LOWER(email)"
    filters:
      - "status != 'deleted'"
      - "created_at > '2023-01-01'"
    
  - source_table: orders
    target_table: fact_orders
    columns:
      "*": "*"  # All columns
    exclude_columns:
      - internal_notes
      - deleted_at
    incremental_key: updated_at
```

## Change Data Capture

### CDC Configuration

```yaml
cdc_config:
  method: logical_replication  # or: trigger, polling
  
  postgresql:
    publication: "sync_publication"
    slot: "sync_slot"
    tables:
      - users
      - orders
      - products
      
  change_tracking:
    capture_deletes: true
    capture_before_values: true
    
  output_format:
    type: json
    include:
      - operation
      - timestamp
      - table
      - key
      - before
      - after
```

### CDC Event Processing

```yaml
cdc_events:
  example_insert:
    operation: INSERT
    timestamp: "2024-01-15T10:30:00Z"
    table: users
    key: { id: 12345 }
    after:
      id: 12345
      email: "user@example.com"
      status: "active"
      
  example_update:
    operation: UPDATE
    timestamp: "2024-01-15T10:31:00Z"
    table: users
    key: { id: 12345 }
    before:
      status: "active"
    after:
      status: "premium"
      
  example_delete:
    operation: DELETE
    timestamp: "2024-01-15T10:32:00Z"
    table: users
    key: { id: 12345 }
    before:
      id: 12345
      email: "user@example.com"
```

## Conflict Resolution

### Conflict Strategies

```yaml
conflict_resolution:
  strategies:
    - name: last_write_wins
      description: "Most recent update wins"
      resolution: |
        IF source.updated_at > target.updated_at
        THEN use source
        ELSE keep target
        
    - name: source_priority
      description: "Source always wins"
      resolution: "always use source"
      
    - name: merge
      description: "Merge non-conflicting fields"
      resolution: |
        FOR each field:
          IF only_one_changed: use_changed
          IF both_changed: use source.field
          
    - name: custom_rules
      description: "Field-specific rules"
      rules:
        - field: quantity
          strategy: sum
        - field: status
          strategy: priority_order
          order: ["active", "pending", "inactive"]
        - field: last_login
          strategy: max
```

### Conflict Logging

```yaml
conflict_log:
  format:
    timestamp: "{{time}}"
    table: "{{table}}"
    key: "{{primary_key}}"
    field: "{{conflicting_field}}"
    source_value: "{{source.value}}"
    target_value: "{{target.value}}"
    resolution: "{{applied_strategy}}"
    result: "{{final_value}}"
    
  storage:
    type: table
    name: sync_conflicts
    retention_days: 90
    
  alerting:
    threshold: 100  # conflicts per hour
    notify: ["slack:#data-alerts"]
```

## Schema Management

### Schema Sync

```yaml
schema_sync:
  mode: evolve  # or: strict, ignore
  
  operations:
    add_column:
      action: apply
      default_value: null
      
    remove_column:
      action: warn
      keep_data: true
      
    modify_type:
      action: review
      safe_changes:
        - varchar_expand
        - int_to_bigint
        
    rename_column:
      action: manual
      create_mapping: true
```

### Migration Scripts

```sql
-- Example Migration: Add new column
ALTER TABLE users 
ADD COLUMN IF NOT EXISTS 
  loyalty_tier VARCHAR(20) DEFAULT 'bronze';

-- Example Migration: Create sync tracking table
CREATE TABLE IF NOT EXISTS _sync_metadata (
  table_name VARCHAR(100) PRIMARY KEY,
  last_sync_at TIMESTAMP,
  last_sync_key VARCHAR(255),
  records_synced BIGINT,
  status VARCHAR(20)
);

-- Example Migration: Add sync trigger
CREATE OR REPLACE FUNCTION track_changes()
RETURNS TRIGGER AS $$
BEGIN
  INSERT INTO _change_log (
    table_name, operation, key, changed_at
  ) VALUES (
    TG_TABLE_NAME, TG_OP, NEW.id, NOW()
  );
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;
```

## Monitoring Dashboard

### Sync Status

```
DATABASE SYNC STATUS
═══════════════════════════════════════

OVERALL STATUS: ✓ Healthy

SOURCE: PostgreSQL (production)
TARGET: MySQL (analytics)
MODE:   Incremental CDC

TABLES:
┌──────────────┬──────────┬───────────┬──────────┐
│ Table        │ Status   │ Lag       │ Records  │
├──────────────┼──────────┼───────────┼──────────┤
│ users        │ ✓ Synced │ 2s        │ 1.2M     │
│ orders       │ ✓ Synced │ 5s        │ 8.5M     │
│ products     │ ✓ Synced │ 1s        │ 50K      │
│ events       │ ⚠ Behind │ 2m 30s    │ 45M      │
└──────────────┴──────────┴───────────┴──────────┘

THROUGHPUT:
Current:  5,230 records/sec
Average:  4,850 records/sec
Peak:     12,400 records/sec

LAST 24 HOURS:
Records Synced: 45.2M
Errors:         23
Conflicts:      156
```

### Metrics

```yaml
metrics:
  - name: sync_lag_seconds
    type: gauge
    labels: [table_name, sync_job]
    alert:
      warning: "> 60"
      critical: "> 300"
      
  - name: records_synced_total
    type: counter
    labels: [table_name, operation]
    
  - name: sync_errors_total
    type: counter
    labels: [table_name, error_type]
    
  - name: conflict_count
    type: counter
    labels: [table_name, resolution_strategy]
```

## Integration Examples

### PostgreSQL to BigQuery

```yaml
pg_to_bigquery:
  source:
    type: postgresql
    connection: "${PG_CONNECTION_STRING}"
    tables:
      - name: orders
        incremental_key: updated_at
        
  target:
    type: bigquery
    project: "my-project"
    dataset: "analytics"
    
  schedule: "*/5 * * * *"  # Every 5 minutes
  
  transform:
    - type: add_metadata
      columns:
        _synced_at: "CURRENT_TIMESTAMP()"
        _source: "'production'"
```

### MySQL to Elasticsearch

```yaml
mysql_to_elasticsearch:
  source:
    type: mysql
    tables:
      - products
      
  target:
    type: elasticsearch
    index: products_search
    
  mapping:
    id: _id
    name:
      type: text
      analyzer: standard
    description:
      type: text
      analyzer: english
    category:
      type: keyword
    price:
      type: float
```

## Best Practices

1. **Test Thoroughly**: Validate sync accuracy
2. **Monitor Lag**: Alert on replication delay
3. **Handle Conflicts**: Define clear resolution rules
4. **Backup Before Migration**: Protect data
5. **Use Incremental**: Minimize load
6. **Log Everything**: Maintain audit trail
7. **Plan for Failures**: Implement retry logic
8. **Schema Evolution**: Handle changes gracefully

Related Skills

docker-database

from diegosouzapw/awesome-omni-skill

Configure database containers with security, persistence, and health checks

database-skill

from diegosouzapw/awesome-omni-skill

Design and manage relational databases including table creation, migrations, and schema design. Use for database modeling and maintenance.

database-architect

from diegosouzapw/awesome-omni-skill

Database design and optimization specialist. Schema design, query optimization, indexing strategies, data modeling, and migration planning for relational and NoSQL databases.

cursor-rules-synchronizer

from diegosouzapw/awesome-omni-skill

Synchronizes Cursor Rules (.mdc files in .cursor/rules/) to CLAUDE.md by generating a Rules section with context-efficient descriptions and usage instructions. Use when setting up Cursor Rules for the first time, after adding or modifying rules, or when the Rules section in CLAUDE.md is missing or outdated.

asynchronous-programming-preference

from diegosouzapw/awesome-omni-skill

Favors the use of async and await for asynchronous programming in Python.

async-operations

from diegosouzapw/awesome-omni-skill

Specifies the preferred syntax for asynchronous operations using async/await and onMount for component initialization. This results in cleaner and more readable asynchronous code.

arch-database

from diegosouzapw/awesome-omni-skill

DB architecture: relational vs document vs graph vs vector, schema design, indexing, replication, sharding

acsets-algebraic-databases

from diegosouzapw/awesome-omni-skill

ACSets (Attributed C-Sets): Algebraic databases as in-memory data structures. Category-theoretic formalism for relational databases generalizing graphs and data frames.

async-repl-protocol

from diegosouzapw/awesome-omni-skill

Async REPL Protocol

vercel-kv-database-rules

from diegosouzapw/awesome-omni-skill

Defines how to interact with Vercel's KV database for storing and retrieving session and application data.

Validate with Database

from diegosouzapw/awesome-omni-skill

Connect to live PostgreSQL database to validate schema assumptions, compare pg_dump vs pgschema output, and query system catalogs interactively

sqlmap-database-pentesting

from diegosouzapw/awesome-omni-skill

This skill should be used when the user asks to "automate SQL injection testing," "enumerate database structure," "extract database credentials using sqlmap," "dump tables and columns...