ml-pipeline-workflow
Complete end-to-end MLOps pipeline orchestration from data preparation through model deployment.
About this skill
This skill empowers AI agents to provide comprehensive guidance and orchestration for Machine Learning Operations (MLOps) pipelines. It covers the entire lifecycle, beginning with data preparation and extending through model training, evaluation, deployment, and ongoing monitoring. By leveraging this skill, agents can clarify project goals, apply MLOps best practices, provide actionable steps, and ensure robust validation at each stage, streamlining the development and deployment of production-ready machine learning models. It emphasizes a structured approach to building scalable and maintainable ML workflows.
Best use case
- Designing and implementing MLOps pipelines from scratch for new machine learning projects.
- Automating stages of an ML workflow, such as data ingestion, model training, and deployment.
- Standardizing MLOps practices and ensuring best-practice adherence within an organization.
- Troubleshooting and optimizing existing ML pipelines for performance and reliability.
- Guiding users through the complex process of deploying ML models into production environments.
- A clear, structured plan for an MLOps pipeline tailored to specific project goals and constraints.
- Actionable steps for data preparation, model training, evaluation, deployment, and monitoring.
- Identification and application of MLOps best practices and robust validation methods.
- Guidance on potential pitfalls and verification steps for each stage of the ML lifecycle.
- Access to detailed examples in the implementation playbook (`resources/implementation-playbook.md`) when needed.
Practical example
Example input
Help me design an MLOps pipeline for deploying a fraud detection model. I have historical transaction data and need to ensure continuous retraining and monitoring.
Example output
Okay, I can help you design an MLOps pipeline for your fraud detection model with continuous retraining and monitoring.
Let's clarify your goals and constraints:
1. **Data Sources:** Where is your historical transaction data stored? Are there real-time data streams for inference?
2. **Model Requirements:** What type of model are you considering? Are there latency requirements for fraud detection?
3. **Deployment Environment:** Where will the model be deployed (e.g., cloud platform like AWS, Azure, GCP; on-premise)?
4. **Monitoring Needs:** What metrics are crucial for monitoring (e.g., model drift, performance degradation, data quality)?
5. **Retraining Triggers:** What events should trigger model retraining (e.g., data drift, performance drop, time-based)?
Based on best practices, an ideal pipeline would involve these stages:
**1. Data Ingestion & Validation:**
* **Action:** Set up automated pipelines to ingest historical and streaming transaction data. Implement data schema validation and quality checks.
* **Verification:** Ensure data integrity, completeness, and consistency before further processing.
**2. Data Preprocessing & Feature Engineering:**
* **Action:** Develop robust ETL processes to clean, transform, and create features from raw data.
* **Verification:** Validate feature distributions and ensure reproducibility across training and inference.
**3. Model Training & Evaluation:**
* **Action:** Implement a reproducible training workflow with experiment tracking for hyperparameter tuning and model versioning. Evaluate models using appropriate metrics (e.g., precision, recall, F1-score for fraud detection).
* **Verification:** Cross-validation results, model performance reports, and comparison against a baseline.
**4. Model Registration & Versioning:**
* **Action:** Store trained models in a model registry with versioning, metadata, and associated metrics.
* **Verification:** Models are traceable and associated with their training data and code.
**5. Model Deployment:**
* **Action:** Deploy the best-performing model as a low-latency API endpoint. Consider canary deployments or A/B testing if feasible.
* **Verification:** Endpoint availability, response times, and initial inference accuracy checks.
**6. Model Monitoring & Alerting:**
* **Action:** Set up continuous monitoring for model performance (e.g., accuracy, latency), data drift, and concept drift. Configure alerts for deviations.
* **Verification:** Dashboards displaying real-time model health and automated notifications for issues.
**7. Continuous Retraining & CI/CD:**
* **Action:** Automate the retraining pipeline based on predefined triggers (e.g., data drift detection, scheduled intervals). Integrate the entire workflow into a CI/CD system for automated testing and deployment.
* **Verification:** Regular retraining cycles, updated models in production, and smooth deployment of pipeline changes.
If you'd like to dive into specific technologies or deeper technical guidance, I can open `resources/implementation-playbook.md` for more detailed examples.
When to use this skill
- When you need an AI agent to help design or optimize an end-to-end MLOps pipeline.
- To get expert guidance on best practices for data preparation, model training, evaluation, and deployment.
- When seeking actionable steps and verification methods for MLOps tasks.
- If you require a structured approach to deploying and managing ML models in production.
When not to use this skill
- The task is unrelated to ML pipeline workflow or Machine Learning Operations.
- You need a different domain or tool outside this scope, such as general software development, data visualization, or non-ML specific data analysis.
- You are looking for direct code execution or API calls without requiring high-level orchestration or guidance from the agent.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in `.claude/skills/ml-pipeline-workflow/SKILL.md` inside your project
- Restart your AI agent; it will auto-discover the skill
How ml-pipeline-workflow Compares
| Feature / Agent | ml-pipeline-workflow | Standard Approach |
|---|---|---|
| Platform Support | Claude | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | easy | N/A |
Frequently Asked Questions
What does this skill do?
Complete end-to-end MLOps pipeline orchestration from data preparation through model deployment.
Which AI agents support this skill?
This skill is designed for Claude.
How difficult is it to install?
The installation complexity is rated as easy. You can find the installation instructions above.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
Related Guides
Best AI Skills for Claude
Explore the best AI skills for Claude and Claude Code across coding, research, workflow automation, documentation, and agent operations.
ChatGPT vs Claude for Agent Skills
Compare ChatGPT and Claude for AI agent skills across coding, writing, research, and reusable workflow execution.
Top AI Agents for Productivity
See the top AI agent skills for productivity, workflow automation, operational systems, documentation, and everyday task execution.
SKILL.md Source
# ML Pipeline Workflow
Complete end-to-end MLOps pipeline orchestration from data preparation through model deployment.
## Do not use this skill when
- The task is unrelated to ml pipeline workflow
- You need a different domain or tool outside this scope
## Instructions
- Clarify goals, constraints, and required inputs.
- Apply relevant best practices and validate outcomes.
- Provide actionable steps and verification.
- If detailed examples are required, open `resources/implementation-playbook.md`.
## Overview
This skill provides comprehensive guidance for building production ML pipelines that handle the full lifecycle: data ingestion → preparation → training → validation → deployment → monitoring.
## Use this skill when
- Building new ML pipelines from scratch
- Designing workflow orchestration for ML systems
- Implementing data → model → deployment automation
- Setting up reproducible training workflows
- Creating DAG-based ML orchestration
- Integrating ML components into production systems
## What This Skill Provides
### Core Capabilities
1. **Pipeline Architecture**
- End-to-end workflow design
- DAG orchestration patterns (Airflow, Dagster, Kubeflow)
- Component dependencies and data flow
- Error handling and retry strategies
2. **Data Preparation**
- Data validation and quality checks
- Feature engineering pipelines
- Data versioning and lineage
- Train/validation/test splitting strategies
3. **Model Training**
- Training job orchestration
- Hyperparameter management
- Experiment tracking integration
- Distributed training patterns
4. **Model Validation**
- Validation frameworks and metrics
- A/B testing infrastructure
- Performance regression detection
- Model comparison workflows
5. **Deployment Automation**
- Model serving patterns
- Canary deployments
- Blue-green deployment strategies
- Rollback mechanisms
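As a sketch of the splitting strategies above: for time-ordered data such as transactions, a chronological split keeps evaluation data strictly newer than training data, avoiding leakage from the future. The record layout here is illustrative, not a required schema.

```python
from datetime import datetime, timedelta

def chronological_split(records, train_frac=0.7, val_frac=0.15):
    """Split time-ordered records into train/val/test without shuffling,
    so the model is always evaluated on data newer than it was trained on."""
    records = sorted(records, key=lambda r: r["timestamp"])
    n = len(records)
    train_end = int(n * train_frac)
    val_end = train_end + int(n * val_frac)
    return records[:train_end], records[train_end:val_end], records[val_end:]

# Example: 100 synthetic daily records
base = datetime(2024, 1, 1)
data = [{"timestamp": base + timedelta(days=i), "amount": float(i)} for i in range(100)]
train, val, test = chronological_split(data)
```

A random shuffle would mix future transactions into training, inflating offline metrics relative to production behavior.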
### Reference Documentation
See the `references/` directory for detailed guides:
- **data-preparation.md** - Data cleaning, validation, and feature engineering
- **model-training.md** - Training workflows and best practices
- **model-validation.md** - Validation strategies and metrics
- **model-deployment.md** - Deployment patterns and serving architectures
### Assets and Templates
The `assets/` directory contains:
- **pipeline-dag.yaml.template** - DAG template for workflow orchestration
- **training-config.yaml** - Training configuration template
- **validation-checklist.md** - Pre-deployment validation checklist
## Usage Patterns
### Basic Pipeline Setup
```python
# 1. Define pipeline stages
stages = [
    "data_ingestion",
    "data_validation",
    "feature_engineering",
    "model_training",
    "model_validation",
    "model_deployment",
]

# 2. Configure dependencies
# See assets/pipeline-dag.yaml.template for full example
```
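One minimal way to wire a stage list like this to actual work is a registry of callables, where each stage receives the previous stage's output. The handlers below are stand-in stubs, not real implementations.

```python
def run_pipeline(stages, handlers, payload=None):
    """Run stages in order, passing each stage's output to the next.
    Missing handlers fail fast before any work starts on that stage."""
    for stage in stages:
        if stage not in handlers:
            raise KeyError(f"no handler registered for stage '{stage}'")
        payload = handlers[stage](payload)
    return payload

# Illustrative stubs; real handlers would call into your data/ML code.
handlers = {
    "data_ingestion": lambda _: [3, 1, 2],
    "data_validation": lambda d: d if all(x > 0 for x in d) else None,
    "feature_engineering": lambda d: sorted(d),
    "model_training": lambda d: {"model": "stub", "data": d},
    "model_validation": lambda m: {**m, "validated": True},
    "model_deployment": lambda m: {**m, "deployed": True},
}
stages = ["data_ingestion", "data_validation", "feature_engineering",
          "model_training", "model_validation", "model_deployment"]
result = run_pipeline(stages, handlers)
```

Orchestrators like Airflow or Dagster replace this loop with scheduled, retriable tasks, but the stage-to-callable mapping is the same idea.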
### Production Workflow
1. **Data Preparation Phase**
- Ingest raw data from sources
- Run data quality checks
- Apply feature transformations
- Version processed datasets
2. **Training Phase**
- Load versioned training data
- Execute training jobs
- Track experiments and metrics
- Save trained models
3. **Validation Phase**
- Run validation test suite
- Compare against baseline
- Generate performance reports
- Approve for deployment
4. **Deployment Phase**
- Package model artifacts
- Deploy to serving infrastructure
- Configure monitoring
- Validate production traffic
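The "version processed datasets" step in the Data Preparation Phase can be sketched as content-addressing: hash the processed rows so a training run records exactly which data it consumed. The row layout is illustrative.

```python
import hashlib
import json

def version_dataset(rows):
    """Content-address a processed dataset: identical rows always yield the
    same version id, so a training run can record exactly what it used."""
    blob = json.dumps(rows, sort_keys=True).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()[:12]

processed = [{"txn_id": 1, "amount": 12.5}, {"txn_id": 2, "amount": 3.0}]
dataset_version = version_dataset(processed)  # store alongside the model artifact
```

Tools like DVC do this at file level with remote storage; the principle is the same: the version id changes if and only if the data changes.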
## Best Practices
### Pipeline Design
- **Modularity**: Each stage should be independently testable
- **Idempotency**: Re-running stages should be safe
- **Observability**: Log metrics at every stage
- **Versioning**: Track data, code, and model versions
- **Failure Handling**: Implement retry logic and alerting
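The idempotency and failure-handling principles above combine naturally: if a stage is safe to re-run, a bounded retry wrapper is a reasonable first line of defense. A hedged sketch, with a deliberately flaky stage to show the behavior:

```python
import time

def run_with_retry(stage_fn, *, retries=3, delay=0.0):
    """Retry a stage a bounded number of times; the stage itself must be
    idempotent so a re-run after a partial failure is safe."""
    last_exc = None
    for _attempt in range(retries):
        try:
            return stage_fn()
        except Exception as exc:
            last_exc = exc
            time.sleep(delay)  # real pipelines would back off exponentially
    raise RuntimeError(f"stage failed after {retries} attempts") from last_exc

calls = {"n": 0}
def flaky_stage():
    calls["n"] += 1
    if calls["n"] < 3:
        raise IOError("transient failure")
    return "ok"

result = run_with_retry(flaky_stage)
```

Orchestrators provide this natively (e.g. task retries in Airflow), but stages still need to be idempotent for retries to be safe.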
### Data Management
- Use data validation libraries (Great Expectations, TFX)
- Version datasets with DVC or similar tools
- Document feature engineering transformations
- Maintain data lineage tracking
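Libraries like Great Expectations provide declarative validation suites; the underlying idea can be shown with a hand-rolled schema check (field names and constraints here are illustrative):

```python
def validate_batch(rows, schema):
    """Check each row against a simple schema of required fields, types,
    and optional value constraints. Returns a list of failure messages."""
    failures = []
    for i, row in enumerate(rows):
        for field, (ftype, check) in schema.items():
            if field not in row:
                failures.append(f"row {i}: missing '{field}'")
            elif not isinstance(row[field], ftype):
                failures.append(f"row {i}: '{field}' has wrong type")
            elif check is not None and not check(row[field]):
                failures.append(f"row {i}: '{field}' failed constraint")
    return failures

schema = {
    "amount": (float, lambda v: v >= 0),
    "merchant_id": (str, None),
}
rows = [
    {"amount": 12.5, "merchant_id": "m-001"},
    {"amount": -3.0, "merchant_id": "m-002"},  # violates non-negative check
    {"merchant_id": "m-003"},                  # missing amount
]
failures = validate_batch(rows, schema)
```

Running checks like these at the ingestion boundary stops bad batches before they reach feature engineering or training.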
### Model Operations
- Separate training and serving infrastructure
- Use model registries (MLflow, Weights & Biases)
- Implement gradual rollouts for new models
- Monitor model performance drift
- Maintain rollback capabilities
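The registry and rollback points above can be illustrated with a toy in-memory registry; real registries (MLflow, W&B) persist artifacts and metadata, but the promote/rollback mechanics look similar:

```python
class ModelRegistry:
    """Toy in-memory registry: register versions with metrics, promote one
    to production, and roll back to the previous production version."""
    def __init__(self):
        self.versions = {}
        self.history = []  # production promotions, oldest first

    def register(self, version, metrics):
        self.versions[version] = metrics

    def promote(self, version):
        if version not in self.versions:
            raise KeyError(f"unknown version: {version}")
        self.history.append(version)

    def rollback(self):
        if len(self.history) < 2:
            raise RuntimeError("no previous version to roll back to")
        self.history.pop()
        return self.history[-1]

    @property
    def production(self):
        return self.history[-1] if self.history else None

reg = ModelRegistry()
reg.register("v1", {"auc": 0.91})
reg.register("v2", {"auc": 0.93})
reg.promote("v1")
reg.promote("v2")
current = reg.rollback()  # v2 misbehaved in production; revert
```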
### Deployment Strategies
- Start with shadow deployments
- Use canary releases for validation
- Implement A/B testing infrastructure
- Set up automated rollback triggers
- Monitor latency and throughput
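A canary release needs a stable way to send a fraction of traffic to the new model. One common approach, sketched here with illustrative names, is hashing a request id into a bucket so each request is routed deterministically:

```python
import hashlib

def route_request(request_id, canary_fraction=0.1):
    """Deterministically route a stable fraction of traffic to the canary
    model by hashing the request id into [0, 1)."""
    h = int(hashlib.md5(request_id.encode("utf-8")).hexdigest(), 16)
    bucket = (h % 10_000) / 10_000
    return "canary" if bucket < canary_fraction else "stable"

routes = [route_request(f"req-{i}") for i in range(10_000)]
canary_share = routes.count("canary") / len(routes)
```

Deterministic routing means the same request id always hits the same model, which keeps A/B comparisons clean and makes incidents reproducible.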
## Integration Points
### Orchestration Tools
- **Apache Airflow**: DAG-based workflow orchestration
- **Dagster**: Asset-based pipeline orchestration
- **Kubeflow Pipelines**: Kubernetes-native ML workflows
- **Prefect**: Modern dataflow automation
### Experiment Tracking
- MLflow for experiment tracking and model registry
- Weights & Biases for visualization and collaboration
- TensorBoard for training metrics
### Deployment Platforms
- AWS SageMaker for managed ML infrastructure
- Google Vertex AI for GCP deployments
- Azure ML for Azure cloud
- Kubernetes + KServe for cloud-agnostic serving
## Progressive Disclosure
Start with the basics and gradually add complexity:
1. **Level 1**: Simple linear pipeline (data → train → deploy)
2. **Level 2**: Add validation and monitoring stages
3. **Level 3**: Implement hyperparameter tuning
4. **Level 4**: Add A/B testing and gradual rollouts
5. **Level 5**: Multi-model pipelines with ensemble strategies
## Common Patterns
### Batch Training Pipeline
```yaml
# See assets/pipeline-dag.yaml.template
stages:
  - name: data_preparation
    dependencies: []
  - name: model_training
    dependencies: [data_preparation]
  - name: model_evaluation
    dependencies: [model_training]
  - name: model_deployment
    dependencies: [model_evaluation]
```
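Given a dependency list like this, an orchestrator must derive a valid execution order. A minimal sketch using Kahn's topological-sort algorithm over the same stage structure:

```python
from collections import deque

def execution_order(stages):
    """Kahn's algorithm: order stages so every stage runs after its
    dependencies; raises if the dependency graph contains a cycle."""
    indegree = {s["name"]: len(s["dependencies"]) for s in stages}
    dependents = {s["name"]: [] for s in stages}
    for s in stages:
        for dep in s["dependencies"]:
            dependents[dep].append(s["name"])
    ready = deque(name for name, deg in indegree.items() if deg == 0)
    order = []
    while ready:
        name = ready.popleft()
        order.append(name)
        for nxt in dependents[name]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)
    if len(order) != len(stages):
        raise ValueError("cycle detected in stage dependencies")
    return order

stages = [
    {"name": "data_preparation", "dependencies": []},
    {"name": "model_training", "dependencies": ["data_preparation"]},
    {"name": "model_evaluation", "dependencies": ["model_training"]},
    {"name": "model_deployment", "dependencies": ["model_evaluation"]},
]
order = execution_order(stages)
```

Airflow, Dagster, and Kubeflow all do this internally; the cycle check is what catches misconfigured DAGs before any stage runs.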
### Real-time Feature Pipeline
```python
# Stream processing for real-time features
# Combined with batch training
# See references/data-preparation.md
```
### Continuous Training
```python
# Automated retraining on schedule
# Triggered by data drift detection
# See references/model-training.md
```
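A drift-triggered retraining check can be sketched with a deliberately simple signal: flag retraining when a feature's batch mean moves too many baseline standard deviations from its training-time mean. The threshold and data here are illustrative; production systems typically use richer tests (PSI, KS) across many features.

```python
import statistics

def should_retrain(baseline, current, threshold=3.0):
    """Flag retraining when the current batch mean drifts more than
    `threshold` baseline standard deviations from the training-time mean."""
    mu = statistics.mean(baseline)
    sigma = statistics.stdev(baseline)
    shift = abs(statistics.mean(current) - mu)
    return shift > threshold * sigma

baseline = [10.0, 11.0, 9.5, 10.5, 10.2, 9.8]   # feature values at training time
stable_batch = [10.1, 10.4, 9.9]                 # recent serving traffic, no drift
drifted_batch = [25.0, 26.5, 24.8]               # distribution has clearly moved
```

Wiring `should_retrain` into the monitoring stage gives the retraining pipeline a concrete trigger instead of a fixed schedule alone.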
## Troubleshooting
### Common Issues
- **Pipeline failures**: Check dependencies and data availability
- **Training instability**: Review hyperparameters and data quality
- **Deployment issues**: Validate model artifacts and serving config
- **Performance degradation**: Monitor data drift and model metrics
### Debugging Steps
1. Check pipeline logs for each stage
2. Validate input/output data at boundaries
3. Test components in isolation
4. Review experiment tracking metrics
5. Inspect model artifacts and metadata
## Next Steps
After setting up your pipeline:
1. Explore **hyperparameter-tuning** skill for optimization
2. Learn **experiment-tracking-setup** for MLflow/W&B
3. Review **model-deployment-patterns** for serving strategies
4. Implement monitoring with observability tools
## Related Skills
- **experiment-tracking-setup**: MLflow and Weights & Biases integration
- **hyperparameter-tuning**: Automated hyperparameter optimization
- **model-deployment-patterns**: Advanced deployment strategies
Related Skills
machine-learning-ops-ml-pipeline
Design and implement a complete ML pipeline for: $ARGUMENTS
mlops-engineer
Build comprehensive ML pipelines, experiment tracking, and model registries with MLflow, Kubeflow, and modern MLOps tools.
n8n-workflow-patterns
Proven architectural patterns for building n8n workflows.
git-pr-workflows-pr-enhance
You are a PR optimization expert specializing in creating high-quality pull requests that facilitate efficient code reviews. Generate comprehensive PR descriptions and automate review processes.
git-advanced-workflows
Master advanced Git techniques to maintain clean history, collaborate effectively, and recover from any situation with confidence.
fal-workflow
Generate workflow JSON files for chaining AI models
expo-cicd-workflows
Helps understand and write EAS workflow YAML files for Expo projects. Use this skill when the user asks about CI/CD or workflows in an Expo or EAS context, mentions .eas/workflows/, or wants help with EAS build pipelines or deployment automation.
deployment-pipeline-design
Architecture patterns for multi-stage CI/CD pipelines with approval gates and deployment strategies.
data-engineering-data-pipeline
You are a data pipeline architecture expert specializing in scalable, reliable, and cost-effective data pipelines for batch and streaming data processing.
cicd-automation-workflow-automate
You are a workflow automation expert specializing in creating efficient CI/CD pipelines, GitHub Actions workflows, and automated development processes. Design and implement automation that reduces manual work, improves consistency, and accelerates delivery while maintaining quality and security.