validating-database-integrity
Process use when you need to ensure database integrity through comprehensive data validation. This skill validates data types, ranges, formats, referential integrity, and business rules. Trigger with phrases like "validate database data", "implement data validation rules", "enforce data integrity constraints", or "validate data formats".
Best use case
validating-database-integrity is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Process use when you need to ensure database integrity through comprehensive data validation. This skill validates data types, ranges, formats, referential integrity, and business rules. Trigger with phrases like "validate database data", "implement data validation rules", "enforce data integrity constraints", or "validate data formats".
Teams using validating-database-integrity should expect a more consistent output, faster repeated execution, less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in
.claude/skills/validating-database-integrity/SKILL.mdinside your project - Restart your AI agent — it will auto-discover the skill
How validating-database-integrity Compares
| Feature / Agent | validating-database-integrity | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Process use when you need to ensure database integrity through comprehensive data validation. This skill validates data types, ranges, formats, referential integrity, and business rules. Trigger with phrases like "validate database data", "implement data validation rules", "enforce data integrity constraints", or "validate data formats".
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# Data Validation Engine
## Overview
Implement and enforce data integrity rules at the database level using CHECK constraints, triggers, foreign keys, and custom validation functions across PostgreSQL and MySQL.
## Prerequisites
- Database credentials with ALTER TABLE and CREATE FUNCTION permissions
- `psql` or `mysql` CLI for executing validation queries
- Current schema documentation or access to `information_schema` for column specifications
- Business rules document describing valid data ranges, formats, and relationships
- Backup of production data before applying new constraints (constraints may reject existing invalid data)
## Instructions
1. Audit existing data quality by running validation queries before adding constraints. Check for NULL values in columns that should be required: `SELECT column_name, COUNT(*) FILTER (WHERE column_name IS NULL) AS null_count, COUNT(*) AS total FROM table_name GROUP BY column_name`.
2. Detect orphaned records (broken referential integrity): `SELECT c.id FROM child_table c LEFT JOIN parent_table p ON c.parent_id = p.id WHERE p.id IS NULL`. Document all orphaned records for cleanup or archival before adding foreign key constraints.
3. Validate data format compliance:
- Email format: `SELECT email FROM users WHERE email !~ '^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$'`
- Phone format: `SELECT phone FROM contacts WHERE phone !~ '^\+?[1-9]\d{6,14}$'`
- URL format: `SELECT url FROM links WHERE url !~ '^https?://.+'`
- Date ranges: `SELECT * FROM events WHERE start_date > end_date`
4. Check numeric range violations: `SELECT * FROM products WHERE price < 0 OR price > 999999.99` and `SELECT * FROM users WHERE age < 0 OR age > 150`. Map each column to its valid range based on business rules.
5. Identify duplicate records that violate intended uniqueness: `SELECT email, COUNT(*) FROM users GROUP BY email HAVING COUNT(*) > 1`. Determine which duplicate to keep (most recent, most complete) and plan deduplication.
6. Generate CHECK constraints for validated rules:
- `ALTER TABLE products ADD CONSTRAINT chk_price_positive CHECK (price >= 0)`
- `ALTER TABLE users ADD CONSTRAINT chk_email_format CHECK (email ~ '^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$')`
- `ALTER TABLE events ADD CONSTRAINT chk_date_order CHECK (start_date <= end_date)`
- `ALTER TABLE orders ADD CONSTRAINT chk_status_valid CHECK (status IN ('pending', 'processing', 'shipped', 'delivered', 'cancelled'))`
7. Create foreign key constraints with appropriate cascade behavior:
- `ALTER TABLE orders ADD CONSTRAINT fk_orders_customer FOREIGN KEY (customer_id) REFERENCES customers(id) ON DELETE RESTRICT`
- Use `ON DELETE CASCADE` for dependent data (order_items when order is deleted)
- Use `ON DELETE SET NULL` for optional relationships (assigned_to when user is deactivated)
8. Implement complex business rule validation using database triggers when CHECK constraints are insufficient:
- Trigger that prevents order total from exceeding customer credit limit
- Trigger that enforces at least one admin user per organization
- Trigger that validates JSON schema for JSONB columns
9. Apply constraints in a safe two-phase approach:
- Phase 1: Run validation queries to find all violations. Generate data cleanup scripts. Execute cleanup.
- Phase 2: Apply constraints with `NOT VALID` option (PostgreSQL): `ALTER TABLE users ADD CONSTRAINT chk_email CHECK (email ~ '...') NOT VALID` then `ALTER TABLE users VALIDATE CONSTRAINT chk_email` (validates existing data without blocking writes).
10. Generate a data quality report summarizing: total records per table, violation counts by constraint type, cleanup actions taken, constraints applied, and remaining data quality issues requiring manual review.
## Output
- **Data quality audit report** with violation counts, examples, and severity ratings
- **Data cleanup scripts** (SQL) to fix violations before constraint application
- **Constraint DDL scripts** with CHECK, FOREIGN KEY, NOT NULL, and UNIQUE constraints
- **Validation triggers** for complex business rules beyond simple constraints
- **Ongoing validation queries** for periodic data quality monitoring
## Error Handling
| Error | Cause | Solution |
|-------|-------|---------|
| `check constraint violated by existing row` | Existing data fails the new constraint | Run the validation query first to find violations; clean up data; use `NOT VALID` option to add constraint without checking existing data, then validate separately |
| `cannot add foreign key: referenced row not found` | Orphaned child records reference non-existent parent | Clean up orphaned records first with DELETE or UPDATE to valid parent; or insert missing parent records |
| `column cannot be made NOT NULL: contains NULL values` | Existing rows have NULL in the target column | Backfill NULLs with `UPDATE table SET column = default_value WHERE column IS NULL` before adding NOT NULL |
| Trigger function causes performance regression | Complex validation logic executes on every INSERT/UPDATE | Optimize trigger function; use WHEN clause to limit trigger firing; consider CHECK constraints instead of triggers for simple rules |
| Circular foreign key prevents constraint creation | Tables reference each other, preventing creation order | Use `ALTER TABLE ADD CONSTRAINT` after both tables exist; or use `DEFERRABLE INITIALLY DEFERRED` constraints |
## Examples
**Auditing a legacy database with 50,000 invalid email addresses**: Validation query reveals 50,000 of 2M user records have invalid email formats (missing @, double dots, spaces). A cleanup script normalizes common issues (trim whitespace, lowercase) and flags 3,000 unfixable records for manual review. After cleanup, a CHECK constraint with regex validation is applied.
**Enforcing referential integrity on a database without foreign keys**: An application relied on application-level FK enforcement, resulting in 12,000 orphaned order_items, 800 orphaned payments, and 200 orphaned reviews. Cleanup scripts archive orphaned records to backup tables, then foreign key constraints with `ON DELETE CASCADE` are added. A nightly validation job monitors for new orphans.
**Implementing business rules for a financial application**: Constraints enforce: account balance cannot be negative (`CHECK (balance >= 0)`), transfer amount must be positive (`CHECK (amount > 0)`), transaction date cannot be in the future (`CHECK (transaction_date <= CURRENT_DATE)`), and a trigger prevents transfers between accounts owned by different customers unless explicitly authorized.
## Resources
- PostgreSQL CHECK constraints: https://www.postgresql.org/docs/current/ddl-constraints.html
- PostgreSQL triggers: https://www.postgresql.org/docs/current/triggers.html
- MySQL CHECK constraints (8.0.16+): https://dev.mysql.com/doc/refman/8.0/en/create-table-check-constraints.html
- Data validation patterns: https://www.postgresql.org/docs/current/ddl-constraints.html#DDL-CONSTRAINTS-CHECK-CONSTRAINTS
- NOT VALID constraint option: https://www.postgresql.org/docs/current/sql-altertable.htmlRelated Skills
validating-performance-budgets
Validate application performance against defined budgets to identify regressions early. Use when checking page load times, bundle sizes, or API response times against thresholds. Trigger with phrases like "validate performance budget", "check performance metrics", or "detect performance regression".
validating-api-schemas
Validate API schemas against OpenAPI, JSON Schema, and GraphQL specifications. Use when validating API schemas and contracts. Trigger with phrases like "validate API schema", "check OpenAPI spec", or "verify schema".
validating-api-responses
Validate API responses against schemas to ensure contract compliance and data integrity. Use when ensuring API response correctness. Trigger with phrases like "validate responses", "check API responses", or "verify response format".
validating-ai-ethics-and-fairness
Validate AI/ML models and datasets for bias, fairness, and ethical concerns. Use when auditing AI systems for ethical compliance, fairness assessment, or bias detection. Trigger with phrases like "evaluate model fairness", "check for bias", or "validate AI ethics".
scanning-database-security
Process use when you need to work with security and compliance. This skill provides security scanning and vulnerability detection with comprehensive guidance and automation. Trigger with phrases like "scan for vulnerabilities", "implement security controls", or "audit security".
validating-pci-dss-compliance
This skill uses the pci-dss-validator plugin to assess codebases and infrastructure configurations for compliance with the Payment Card Industry Data Security Standard (PCI DSS). It identifies potential vulnerabilities and deviations from PCI DSS requirements. Use this skill when the user requests to "validate PCI compliance", "check PCI DSS", "assess PCI security", or "review PCI standards" for a given project or configuration. It helps ensure that systems handling cardholder data meet the necessary security controls.
optimizing-database-connection-pooling
Process use when you need to work with connection management. This skill provides connection pooling and management with comprehensive guidance and automation. Trigger with phrases like "manage connections", "configure pooling", or "optimize connection usage".
monitoring-database-transactions
Monitor use when you need to work with monitoring and observability. This skill provides health monitoring and alerting with comprehensive guidance and automation. Trigger with phrases like "monitor system health", "set up alerts", or "track metrics".
monitoring-database-health
Monitor use when you need to work with monitoring and observability. This skill provides health monitoring and alerting with comprehensive guidance and automation. Trigger with phrases like "monitor system health", "set up alerts", or "track metrics".
managing-database-sharding
Process use when you need to work with database sharding. This skill provides horizontal sharding strategies with comprehensive guidance and automation. Trigger with phrases like "implement sharding", "shard database", or "distribute data".
managing-database-replication
Process use when you need to work with database scalability. This skill provides replication and sharding with comprehensive guidance and automation. Trigger with phrases like "set up replication", "implement sharding", or "scale database".
managing-database-recovery
Process use when you need to work with database operations. This skill provides database management and optimization with comprehensive guidance and automation. Trigger with phrases like "manage database", "optimize database", or "configure database".