data-quality-frameworks
Implement data quality validation with Great Expectations, dbt tests, and data contracts. Use when building data quality pipelines, implementing validation rules, or establishing data contracts.
About this skill
This skill enables AI agents to assist in the design, implementation, and management of comprehensive data quality validation frameworks. It leverages industry-standard tools like Great Expectations for declarative data validation, dbt (data build tool) for testing data transformations, and the establishment of data contracts to define expectations between data producers and consumers. By utilizing this skill, AI agents can guide users through setting up robust data quality pipelines, defining validation rules, monitoring data quality metrics, and automating validation within CI/CD workflows, ensuring data reliability and trustworthiness across an organization's data ecosystem. It provides production patterns and best practices for building high-quality data assets.
Best use case
Assisting in the design and implementation of data quality checks within ETL/ELT pipelines; guiding the setup of Great Expectations validation suites; building comprehensive dbt test suites for data transformations; facilitating the establishment and enforcement of data contracts between data-producing and data-consuming teams; advising on monitoring data quality metrics and automating data validation in CI/CD environments.
Users will gain a well-structured and implemented data quality framework, featuring automated validation rules, dbt tests, and clearly defined data contracts. This leads to increased data reliability, reduced data incidents, improved trust in data assets, and more efficient data operations, facilitated by expert guidance from the AI agent.
Practical example
Example input
Set up a data quality framework for my new customer database. I want to use Great Expectations for core validations and dbt tests for my transformation models. How do I define data contracts for the `orders` table?
Example output
To establish a robust data quality framework for your customer database using Great Expectations and dbt, we'll start by defining expectations for your raw data tables. For instance, for your `customers` table, we can set expectations such as '`customer_id` is unique and not null', '`email` matches a regex pattern', and '`signup_date` is a valid date and falls within an expected range'. I can help generate the initial Great Expectations configuration. Next, for your dbt models, we'll implement tests directly within your `schema.yml` files. For your `stg_customers` model, common tests include `unique` and `not_null` for primary keys, `relationships` to other tables, and `accepted_values` for categorical columns like `customer_segment`. For data contracts, especially for your `orders` table, we need to explicitly define schema, data types, uniqueness constraints, nullability, expected value ranges, and update frequency. This contract will serve as an agreement between the team producing the `orders` data and the teams consuming it. Would you like me to draft a preliminary data contract template for the `orders` table, including common attributes like `order_id`, `customer_id`, `order_date`, `total_amount`, and `status`?
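To make the example concrete, here is a minimal sketch of those customer expectations in Python. The table and column names (`customers`, `customer_id`, `email`, `signup_date`) come from the example above; the email regex, date range, and CSV source are placeholder assumptions, and the setup call (`ge.from_pandas`) follows the classic Great Expectations API, which differs in newer GX releases.

```python
# Minimal sketch of the expectations described above, using the classic
# Great Expectations API ("from_pandas" style); setup differs in newer GX releases.
import pandas as pd
import great_expectations as ge

# Illustrative data load -- replace with your warehouse or file source.
customers = pd.read_csv("customers.csv")
dataset = ge.from_pandas(customers)

# customer_id is unique and not null
dataset.expect_column_values_to_be_unique("customer_id")
dataset.expect_column_values_to_not_be_null("customer_id")

# email matches a simple placeholder regex pattern
dataset.expect_column_values_to_match_regex("email", r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

# signup_date falls within an assumed range (ISO-formatted date strings)
dataset.expect_column_values_to_be_between(
    "signup_date", min_value="2015-01-01", max_value="2030-12-31"
)

results = dataset.validate()
print(results.success)
```

In a real pipeline these expectations would typically live in a saved expectation suite and run from a checkpoint rather than inline, so they can be versioned and reused across runs.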
When to use this skill
- Implementing data quality checks in pipelines
- Setting up Great Expectations validation
- Building comprehensive dbt test suites
- Establishing data contracts between teams
When not to use this skill
- The data sources are undefined or unavailable
- You cannot modify validation rules or schemas
- The task is unrelated to data quality or contracts
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in `.claude/skills/data-quality-frameworks/SKILL.md` inside your project
- Restart your AI agent; it will auto-discover the skill
How data-quality-frameworks Compares
| Feature / Agent | data-quality-frameworks | Standard Approach |
|---|---|---|
| Platform Support | Claude | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Easy | N/A |
Frequently Asked Questions
What does this skill do?
Implement data quality validation with Great Expectations, dbt tests, and data contracts. Use when building data quality pipelines, implementing validation rules, or establishing data contracts.
Which AI agents support this skill?
This skill is designed for Claude.
How difficult is it to install?
The installation complexity is rated as easy. You can find the installation instructions above.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
Related Guides
AI Agents for Coding
Browse AI agent skills for coding, debugging, testing, refactoring, code review, and developer workflows across Claude, Cursor, and Codex.
Best AI Skills for Claude
Explore the best AI skills for Claude and Claude Code across coding, research, workflow automation, documentation, and agent operations.
AI Agents for Startups
Explore AI agent skills for startup validation, product research, growth experiments, documentation, and fast execution with small teams.
SKILL.md Source
# Data Quality Frameworks

Production patterns for implementing data quality with Great Expectations, dbt tests, and data contracts to ensure reliable data pipelines.

## Use this skill when

- Implementing data quality checks in pipelines
- Setting up Great Expectations validation
- Building comprehensive dbt test suites
- Establishing data contracts between teams
- Monitoring data quality metrics
- Automating data validation in CI/CD

## Do not use this skill when

- The data sources are undefined or unavailable
- You cannot modify validation rules or schemas
- The task is unrelated to data quality or contracts

## Instructions

- Identify critical datasets and quality dimensions.
- Define expectations/tests and contract rules.
- Automate validation in CI/CD and schedule checks.
- Set alerting, ownership, and remediation steps.
- If detailed patterns are required, open `resources/implementation-playbook.md`.

## Safety

- Avoid blocking critical pipelines without a fallback plan.
- Handle sensitive data securely in validation outputs.

## Resources

- `resources/implementation-playbook.md` for detailed frameworks, templates, and examples.
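The instructions above mention defining contract rules and automating checks in CI/CD. As an illustrative sketch only, separate from the skill's bundled `resources/implementation-playbook.md`, a lightweight contract for the `orders` table from the earlier example could be enforced with plain pandas. The column names, dtypes, and accepted status values below are assumptions for demonstration, not a prescribed template.

```python
# Hypothetical, minimal data-contract check for an `orders` table using pandas.
# The contract fields and accepted values below are illustrative assumptions.
import pandas as pd

ORDERS_CONTRACT = {
    "columns": {
        "order_id": {"dtype": "int64", "nullable": False, "unique": True},
        "customer_id": {"dtype": "int64", "nullable": False, "unique": False},
        "order_date": {"dtype": "object", "nullable": False, "unique": False},
        "total_amount": {"dtype": "float64", "nullable": False, "unique": False},
        "status": {"dtype": "object", "nullable": False, "unique": False},
    },
    "accepted_status_values": {"pending", "shipped", "delivered", "cancelled"},
}

def validate_orders(df: pd.DataFrame, contract: dict) -> list[str]:
    """Return a list of human-readable contract violations (empty list means pass)."""
    errors = []
    for name, rules in contract["columns"].items():
        if name not in df.columns:
            errors.append(f"missing column: {name}")
            continue
        if str(df[name].dtype) != rules["dtype"]:
            errors.append(f"{name}: expected dtype {rules['dtype']}, got {df[name].dtype}")
        if not rules["nullable"] and df[name].isnull().any():
            errors.append(f"{name}: null values are not allowed")
        if rules["unique"] and df[name].duplicated().any():
            errors.append(f"{name}: values must be unique")
    if "status" in df.columns:
        bad_status = set(df["status"].dropna()) - contract["accepted_status_values"]
        if bad_status:
            errors.append(f"status: unexpected values {sorted(bad_status)}")
    return errors

# Example CI usage: fail the pipeline step when the contract is violated.
if __name__ == "__main__":
    orders = pd.read_parquet("orders.parquet")  # placeholder source
    violations = validate_orders(orders, ORDERS_CONTRACT)
    if violations:
        raise SystemExit("Data contract violations:\n" + "\n".join(violations))
```

Running a script like this as a CI step (or as a task in your orchestrator) gives producers and consumers a shared, executable definition of the contract, which is the pattern the skill's instructions describe.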
Related Skills
data-engineer
Build scalable data pipelines, modern data warehouses, and real-time streaming architectures. Implements Apache Spark, dbt, Airflow, and cloud-native data platforms.
cc-skill-clickhouse-io
ClickHouse database patterns, query optimization, analytics, and data engineering best practices for high-performance analytical workloads.
Data Engineering Command Center
Complete methodology for designing, building, operating, and scaling data pipelines and infrastructure. Zero dependencies — pure agent skill.
native-data-fetching
Use when implementing or debugging ANY network request, API call, or data fetching. Covers fetch API, React Query, SWR, error handling, caching, offline support, and Expo Router data loaders (useLoaderData).
hugging-face-datasets
Create and manage datasets on Hugging Face Hub. Supports initializing repos, defining configs/system prompts, streaming row updates, and SQL-based dataset querying/transformation. Designed to work alongside HF MCP server for comprehensive dataset workflows.
hugging-face-dataset-viewer
Query Hugging Face datasets through the Dataset Viewer API for splits, rows, search, filters, and parquet links.
gdpr-data-handling
Practical implementation guide for GDPR-compliant data processing, consent management, and privacy controls.
fp-data-transforms
Everyday data transformations using functional patterns - arrays, objects, grouping, aggregation, and null-safe access
food-database-query
Food Database Query
database
Database development and operations workflow covering SQL, NoSQL, database design, migrations, optimization, and data engineering.
database-optimizer
Expert database optimizer specializing in modern performance tuning, query optimization, and scalable architectures.
database-migrations-sql-migrations
SQL database migrations with zero-downtime strategies for PostgreSQL, MySQL, and SQL Server. Focus on data integrity and rollback plans.