ml-pipeline

Complete machine learning pipeline for trading: feature engineering, AutoML, deep learning, and financial RL. Use for automated parameter sweeps, feature creation, model training, and anti-leakage validation.

3,891 stars

Best use case

ml-pipeline is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Complete machine learning pipeline for trading: feature engineering, AutoML, deep learning, and financial RL. Use for automated parameter sweeps, feature creation, model training, and anti-leakage validation.

Teams using ml-pipeline should expect a more consistent output, faster repeated execution, less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$curl -o ~/.claude/skills/ml-pipeline/SKILL.md --create-dirs "https://raw.githubusercontent.com/openclaw/skills/main/skills/ahuserious/ml-pipeline/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/ml-pipeline/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How ml-pipeline Compares

Feature / Agentml-pipelineStandard Approach
Platform SupportNot specifiedLimited / Varies
Context Awareness High Baseline
Installation ComplexityUnknownN/A

Frequently Asked Questions

What does this skill do?

Complete machine learning pipeline for trading: feature engineering, AutoML, deep learning, and financial RL. Use for automated parameter sweeps, feature creation, model training, and anti-leakage validation.

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

Related Guides

SKILL.md Source

# ML Pipeline

Unified skill for the complete ML pipeline within a quant trading research system.
Consolidates eight prior skills into a single authoritative reference covering
the full lifecycle: data validation, feature creation, selection,
transformation, anti-leakage checks, pipeline automation, deep learning optimization, and deployment.

---

## 1. When to Use

Activate this skill when the task involves any of the following:

- Creating, selecting, or transforming features for an ML-driven strategy.
- Auditing an existing feature pipeline for data leakage or overfitting risk.
- Automating an end-to-end ML pipeline (data prep through model export).
- Evaluating feature importance, scaling, encoding, or interaction effects.
- Integrating features with a feature store (Feast, Tecton, custom Parquet store).
- Explaining core ML concepts (bias-variance, cross-validation, regularisation)
  in the context of feature engineering decisions.

---

## 2. Inputs to Gather

Before starting work, collect or confirm:

| Input | Details |
|-------|---------|
| **Objective** | Target metric (Sharpe, accuracy, RMSE ...), constraints, time horizon. |
| **Data** | Symbols / instruments, timeframe, bar type, sampling frequency, data sources. |
| **Leakage risks** | Point-in-time concerns, survivorship bias, look-ahead in labels or features. |
| **Compute budget** | CPU/GPU limits, wall-clock budget for AutoML search. |
| **Latency** | Online vs. offline inference, acceptable prediction latency. |
| **Interpretability** | Regulatory or research need for explainable features / models. |
| **Deployment target** | Where the model will run (notebook, backtest harness, live engine). |

---

## 3. Feature Creation Patterns

### 3.1 Numerical Features

- **Interaction terms**: `price * volume`, `high / low`, `close - open`.
- **Rolling statistics**: mean, std, skew, kurtosis over configurable windows.
- **Polynomial / log transforms**: `log(volume + 1)`, `spread^2`.
- **Binning / discretisation**: equal-width, quantile-based, or domain-driven bins.

### 3.2 Categorical Features

- **One-hot encoding**: for low-cardinality categoricals (sector, exchange).
- **Target encoding**: mean-target per category with smoothing (careful of leakage -- use only in-fold means).
- **Ordinal encoding**: when categories have a natural order (credit rating).

### 3.3 Time-Series Specific

- **Lag features**: `return_{t-1}`, `return_{t-5}`, etc.
- **Calendar features**: day-of-week, month, quarter, options-expiry flag.
- **Rolling z-score**: `(x - rolling_mean) / rolling_std` for stationarity.
- **Fractional differentiation**: preserve memory while achieving stationarity (Lopez de Prado).

### 3.4 Feature Selection Techniques

- **Filter methods**: mutual information, variance threshold, correlation pruning.
- **Wrapper methods**: recursive feature elimination (RFE), forward/backward selection.
- **Embedded methods**: L1 regularisation, tree-based importance, SHAP values.
- **Permutation importance**: model-agnostic; run on out-of-fold predictions.

---

## 4. Anti-Leakage Checks

Data leakage is the single most common cause of inflated backtest results.
Apply these checks at every pipeline stage:

### 4.1 Label Leakage

- Labels must be computed from **future** returns relative to the feature
  timestamp. Verify that the label window does not overlap the feature window.
- Use purging and embargo when labels span multiple bars.

### 4.2 Feature Leakage

- No feature may use information from time `t+1` or later at prediction time `t`.
- Rolling statistics must use a **closed** left window: `df['feat'].rolling(20).mean().shift(1)`.
- Target-encoded categoricals must be computed on the **training fold only**.

### 4.3 Cross-Validation Leakage

- Use **purged k-fold** or **walk-forward** CV for time-series. Never use random
  k-fold on ordered data.
- Insert an **embargo gap** between train and test folds to prevent bleed-through
  from autocorrelation.

### 4.4 Survivorship & Selection Bias

- Ensure the universe of instruments at time `t` reflects what was actually
  tradable at that time (delisted stocks, halted symbols removed later).
- Backfill from point-in-time databases where available.

### 4.5 Validation Checklist

Run before every backtest:

```text
[ ] Labels computed strictly from future returns (no overlap with features)
[ ] All rolling features shifted by at least 1 bar
[ ] Target encoding uses in-fold means only
[ ] Walk-forward or purged CV used (no random shuffle on time-series)
[ ] Embargo gap >= max(label_horizon, autocorrelation_lag)
[ ] Universe is point-in-time (no survivorship bias)
[ ] No global scaling fitted on full dataset (fit on train, transform test)
```

---

## 5. Pipeline Automation (AutoML)

### 5.1 Prerequisites

- Python environment with one or more AutoML libraries:
  Auto-sklearn, TPOT, H2O AutoML, PyCaret, Optuna, or custom Optuna pipelines.
- Training data in CSV / Parquet / database.
- Problem type identified: classification, regression, or time-series forecasting.

### 5.2 Pipeline Steps

| Step | Action |
|------|--------|
| **1. Define requirements** | Problem type, evaluation metric, time/resource budget, interpretability needs. |
| **2. Data infrastructure** | Load data, quality assessment, train/val/test split strategy, define feature transforms. |
| **3. Configure AutoML** | Select framework, define algorithm search space, set preprocessing steps, choose tuning strategy (Bayesian, random, Hyperband). |
| **4. Execute training** | Run automated feature engineering, model selection, hyperparameter optimisation, cross-validation. |
| **5. Analyse & export** | Compare models, extract best config, feature importance, visualisations, export for deployment. |

### 5.3 Pipeline Configuration Template

```python
pipeline_config = {
    "task_type": "classification",        # or "regression", "time_series"
    "time_budget_seconds": 3600,
    "algorithms": ["rf", "xgboost", "catboost", "lightgbm"],
    "preprocessing": ["scaling", "encoding", "imputation"],
    "tuning_strategy": "bayesian",        # or "random", "hyperband"
    "cv_folds": 5,
    "cv_type": "purged_kfold",            # or "walk_forward"
    "embargo_bars": 10,
    "early_stopping_rounds": 50,
    "metric": "sharpe_ratio",             # domain-specific metric
}
```

### 5.4 Output Artifacts

- `automl_config.py` -- pipeline configuration.
- `best_model.pkl` / `.joblib` / `.onnx` -- serialised model.
- `feature_pipeline.pkl` -- fitted preprocessing + feature transforms.
- `evaluation_report.json` -- metrics, confusion matrix / residuals, feature rankings.
- `deployment/` -- prediction API code, input validation, requirements.txt.

---

## 6. Core ML Fundamentals (Feature-Engineering Context)

### 6.1 Bias-Variance Trade-off

- More features increase model capacity (lower bias) but risk overfitting (higher variance).
- Use regularisation (L1/L2), feature selection, or dimensionality reduction to manage.

### 6.2 Evaluation Strategy

- **Walk-forward validation**: the gold standard for time-series strategies.
  Roll a fixed-width training window forward; test on the next out-of-sample period.
- **Monte Carlo permutation tests**: shuffle labels and re-evaluate to estimate
  the probability that observed performance is due to chance.
- **Combinatorial purged CV (CPCV)**: generate many train/test combinations with
  purging for more robust performance estimates.

### 6.3 Feature Scaling

- Fit scalers (StandardScaler, MinMaxScaler, RobustScaler) on the **training set only**.
- Apply the same fitted scaler to validation and test sets.
- RobustScaler is often preferred for financial data due to heavy tails.

### 6.4 Handling Missing Data

- Forward-fill then backward-fill for price data (be aware of leakage on backfill).
- Indicator column for missingness can itself be informative.
- Tree-based models can handle NaN natively; linear models cannot.

---

## 7. Workflow

For any feature engineering task, follow this sequence:

1. **Restate** the task in measurable terms (metric, constraints, deadline).
2. **Enumerate** required artifacts: datasets, feature definitions, configs, scripts, reports.
3. **Propose** a default approach and 1-2 alternatives with trade-offs.
4. **Implement** feature pipeline with anti-leakage checks built in.
5. **Validate** with walk-forward CV, Monte Carlo, and the leakage checklist above.
6. **Deliver** repo-ready code, documentation, and a run command.

---

## 8. Deep Learning Optimization

### 8.1 Optimizer Selection

| Optimizer | Best For | Learning Rate |
|-----------|----------|---------------|
| Adam | Most cases, adaptive | 1e-3 to 1e-4 |
| AdamW | Transformers, weight decay | 1e-4 to 1e-5 |
| SGD + Momentum | Large batches, fine-tuning | 1e-2 to 1e-3 |
| RAdam | Stability without warmup | 1e-3 |

### 8.2 Learning Rate Scheduling

- **OneCycleLR**: Best for short training, fast convergence
- **CosineAnnealing**: Smooth decay, good generalization
- **ReduceOnPlateau**: Adaptive when validation loss plateaus
- **Warmup + Decay**: Standard for transformers

### 8.3 Regularization Techniques

- **Dropout**: 0.1-0.5 for fully connected layers
- **L2 (Weight Decay)**: 1e-4 to 1e-2
- **Batch Normalization**: Stabilizes training
- **Early Stopping**: Monitor validation loss, patience 5-10 epochs

### 8.4 PyTorch Lightning Integration

```python
import pytorch_lightning as pl

class TradingModel(pl.LightningModule):
    def configure_optimizers(self):
        optimizer = torch.optim.AdamW(self.parameters(), lr=1e-4)
        scheduler = torch.optim.lr_scheduler.OneCycleLR(
            optimizer, max_lr=1e-3, total_steps=self.trainer.estimated_stepping_batches
        )
        return [optimizer], [scheduler]
```

### 8.5 Financial Reinforcement Learning

- **State**: Market features, portfolio state, position
- **Action**: Buy/Sell/Hold, position sizing
- **Reward**: Risk-adjusted returns (Sharpe, Sortino)
- **Frameworks**: Stable-Baselines3, RLlib, FinRL

---

## 9. Error Handling

| Problem | Cause | Fix |
|---------|-------|-----|
| AutoML search finds no good model | Insufficient time budget or poor features | Increase budget, engineer better features, expand algorithm search space. |
| Out of memory during training | Dataset too large for available RAM | Downsample, use incremental learning, simplify feature engineering. |
| Model accuracy below threshold | Weak signal or overfitting | Collect more data, add domain-driven features, regularise, adjust metric. |
| Feature transforms produce NaN/Inf | Division by zero, log of negative | Add guards: `np.where(denom != 0, ...)`, `np.log1p(np.abs(x))`. |
| Optimiser fails to converge | Bad hyperparameter ranges | Tighten search bounds, increase iterations, exclude unstable algorithms. |

---

## 10. Bundled Scripts

All scripts live in `scripts/` within this skill directory.

| Script | Purpose |
|--------|---------|
| `data_validation.py` | Validate input data quality before pipeline execution. |
| `model_evaluation.py` | Evaluate trained model performance and generate reports. |
| `pipeline_deployment.py` | Deploy a trained pipeline to a target environment with rollback support. |
| `feature_engineering_pipeline.py` | End-to-end feature engineering: load, clean, transform, select, train. |
| `feature_importance_analyzer.py` | Analyse feature importance (permutation, SHAP, tree-based). |
| `data_visualizer.py` | Visualise feature distributions and relationships to target. |
| `feature_store_integration.py` | Integrate with feature stores (Feast, Tecton) for online/offline serving. |

---

## 11. Resources

### Frameworks

- **scikit-learn** -- preprocessing, feature selection, pipelines.
- **Auto-sklearn / TPOT / H2O AutoML / PyCaret** -- automated pipeline search.
- **Optuna** -- flexible hyperparameter optimisation.
- **SHAP** -- model-agnostic feature importance.
- **Feast / Tecton** -- feature store management.
- **PyTorch Lightning** -- https://lightning.ai/docs/pytorch/stable/
- **Stable-Baselines3** -- https://stable-baselines3.readthedocs.io/
- **FinRL** -- https://github.com/AI4Finance-Foundation/FinRL

### Key References

- Lopez de Prado, *Advances in Financial Machine Learning* (2018) -- purged CV, fractional differentiation, meta-labelling.
- Hastie, Tibshirani & Friedman, *The Elements of Statistical Learning* -- bias-variance, regularisation, model selection.
- scikit-learn user guide: feature extraction, preprocessing, model selection.

### Best Practices

- Always start with a simple baseline before running AutoML.
- Balance automation with domain knowledge -- blind search rarely beats informed priors.
- Monitor resource consumption; set hard timeouts.
- Validate on true out-of-sample holdout data, not just cross-validation.
- Document every pipeline decision for reproducibility.

Related Skills

docs-pipeline-automation

3891
from openclaw/skills

Build repeatable data-to-Docs pipelines from Sheets and Drive sources. Use for automated status reports, template-based document assembly, and scheduled publishing workflows.

Workflow & Productivity

pipeline-analytics

3891
from openclaw/skills

Generate interactive analytics dashboards from CRM data. Use when asked to "show pipeline stats", "create a report", "analyze leads", "show conversion rates", "build a dashboard", "visualize outreach data", "funnel analysis", or any data visualization request from DuckDB workspace data.

ci-cd-pipeline-builder

3891
from openclaw/skills

CI/CD Pipeline Builder

lead-gen-pipeline

3891
from openclaw/skills

Automated lead generation pipeline with AI-powered lead scoring and personalized follow-up generation. Score leads 0-100 with reasoning, generate context-aware follow-ups in multiple tones. Integrates with any CRM. Use for sales automation, cold outreach, and pipeline management.

Sales Pipeline Tracker

3891
from openclaw/skills

Track deals through every stage from lead to close. Manage pipeline stages, update deal status, forecast revenue, and identify bottlenecks in your sales process.

---

3891
from openclaw/skills

name: article-factory-wechat

Content & Documentation

humanizer

3891
from openclaw/skills

Remove signs of AI-generated writing from text. Use when editing or reviewing text to make it sound more natural and human-written. Based on Wikipedia's comprehensive "Signs of AI writing" guide. Detects and fixes patterns including: inflated symbolism, promotional language, superficial -ing analyses, vague attributions, em dash overuse, rule of three, AI vocabulary words, negative parallelisms, and excessive conjunctive phrases.

Content & Documentation

find-skills

3891
from openclaw/skills

Helps users discover and install agent skills when they ask questions like "how do I do X", "find a skill for X", "is there a skill that can...", or express interest in extending capabilities. This skill should be used when the user is looking for functionality that might exist as an installable skill.

General Utilities

tavily-search

3891
from openclaw/skills

Use Tavily API for real-time web search and content extraction. Use when: user needs real-time web search results, research, or current information from the web. Requires Tavily API key.

Data & Research

baidu-search

3891
from openclaw/skills

Search the web using Baidu AI Search Engine (BDSE). Use for live information, documentation, or research topics.

Data & Research

agent-autonomy-kit

3891
from openclaw/skills

Stop waiting for prompts. Keep working.

Workflow & Productivity

Meeting Prep

3891
from openclaw/skills

Never walk into a meeting unprepared again. Your agent researches all attendees before calendar events—pulling LinkedIn profiles, recent company news, mutual connections, and conversation starters. Generates a briefing doc with talking points, icebreakers, and context so you show up informed and confident. Triggered automatically before meetings or on-demand. Configure research depth, advance timing, and output format. Walking into meetings blind is amateur hour—missed connections, generic small talk, zero leverage. Use when setting up meeting intelligence, researching specific attendees, generating pre-meeting briefs, or automating your prep workflow.

Workflow & Productivity