signal-classification

ML trading signal classifiers using XGBoost and LightGBM with walk-forward validation, SHAP feature importance, and threshold optimization

by agiprolabs

Best use case

signal-classification is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using signal-classification should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/signal-classification/SKILL.md --create-dirs "https://raw.githubusercontent.com/agiprolabs/claude-trading-skills/main/skills/signal-classification/SKILL.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/signal-classification/SKILL.md inside your project
Restart your AI agent — it will auto-discover the skill

How signal-classification Compares

Feature / Agent	signal-classification	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

What does this skill do?

ML trading signal classifiers using XGBoost and LightGBM with walk-forward validation, SHAP feature importance, and threshold optimization

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# Signal Classification

Predict whether an asset's price will move up or down over a forward horizon using supervised machine learning classifiers. This skill covers the full pipeline: label creation, model training, walk-forward validation, feature importance analysis, and threshold optimization for trading applications.

## Why Tree-Based Models Dominate Trading ML

XGBoost and LightGBM are the workhorses of quantitative trading ML for good reason:

- **Non-linear relationships**: Financial features interact in complex, non-linear ways that trees capture naturally
- **Robust to feature scale**: No need to normalize or standardize inputs — trees split on rank order
- **Built-in feature importance**: Understand which features drive predictions without separate analysis
- **Fast training and inference**: Train on thousands of samples in seconds, predict in microseconds
- **Handle missing values**: Native support for NaN without imputation hacks
- **Regularization built in**: `max_depth`, `min_child_weight`, and `subsample` all help limit overfitting

Linear models and deep learning have their place, but for tabular trading features with fewer than 100k samples, gradient-boosted trees typically outperform the alternatives.

## Classification Types

### Binary Classification

The simplest and most common setup. Predict whether forward returns exceed a threshold:

- **Up signal**: forward return > +1%
- **Down signal**: forward return < -1%
- **Neutral (excluded)**: -1% to +1% — drop these from training to create cleaner labels

```python
import numpy as np

def create_binary_labels(
    prices: np.ndarray, horizon: int = 24, threshold: float = 0.01
) -> np.ndarray:
    """Create binary labels from forward returns.

    Args:
        prices: Array of prices.
        horizon: Forward return horizon in bars (how far ahead to look).
        threshold: Minimum return magnitude for a label.

    Returns:
        Array of labels: 1 (up), 0 (down), NaN (neutral).
    """
    fwd_returns = np.roll(prices, -horizon) / prices - 1
    fwd_returns[-horizon:] = np.nan
    labels = np.where(fwd_returns > threshold, 1,
             np.where(fwd_returns < -threshold, 0, np.nan))
    return labels
```

### Multi-Class Classification

Three classes for finer signal granularity:

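| Class | Condition | Interpretation |
|-------|-----------|----------------|
| Strong Up | fwd_return > +2% | High confidence long |
| Mild Up | +0.5% ≤ fwd_return ≤ +2% | Moderate confidence |
| Down | fwd_return < -0.5% | Avoid / short |

Returns between -0.5% and +0.5% are not covered by any class. Treat them as a neutral zone and drop them, as in the binary setup.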

Multi-class reduces per-class sample size. Use only with large datasets (1000+ samples per class).
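
The class table above can be turned into labels along the same lines as `create_binary_labels`. The following is a minimal sketch, not part of the original skill: the function name `create_multiclass_labels` and the integer encoding (2, 1, 0) are assumptions chosen for illustration, and returns outside every class are left as NaN so they can be dropped before training.

```python
import numpy as np

def create_multiclass_labels(
    prices: np.ndarray,
    horizon: int = 24,
    strong_up: float = 0.02,
    mild_up: float = 0.005,
    down: float = -0.005,
) -> np.ndarray:
    """Label each bar as 2 (strong up), 1 (mild up), 0 (down), or NaN (unlabeled)."""
    if horizon < 1:
        raise ValueError("horizon must be at least 1 bar")
    prices = np.asarray(prices, dtype=float)
    fwd = np.full(prices.shape, np.nan)
    # Return from bar t to bar t + horizon; the last `horizon` bars have no label.
    fwd[:-horizon] = prices[horizon:] / prices[:-horizon] - 1
    labels = np.full(prices.shape, np.nan)
    # Comparisons against NaN are False, so unlabeled bars stay NaN.
    labels[fwd > strong_up] = 2
    labels[(fwd >= mild_up) & (fwd <= strong_up)] = 1
    labels[fwd < down] = 0
    return labels
```

Rows where the label is NaN (the neutral band and the final `horizon` bars) would be masked out of both the feature matrix and the label vector before fitting.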

### Probability Calibration

Raw model probabilities from XGBoost/LightGBM are not well-calibrated. A predicted 0.7 probability does not mean 70% chance of being correct. Use calibration to fix this:

```python
from sklearn.calibration import CalibratedClassifierCV

calibrated = CalibratedClassifierCV(base_model, cv=5, method="isotonic")
calibrated.fit(X_train, y_train)
probs = calibrated.predict_proba(X_test)[:, 1]
```

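Isotonic calibration often works better than Platt scaling (`method="sigmoid"`) for tree models when there are plenty of calibration samples; with small samples it tends to overfit, and sigmoid is the safer choice. Note that `cv=5` builds its folds without regard to time ordering, so for trading data consider calibrating on a held-out slice that comes after the training data.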
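
One way to see whether probabilities are calibrated is to bin the predictions and compare the average predicted probability in each bin with the observed hit rate. The sketch below is an illustration only; `reliability_table` is a hypothetical helper, not something the skill or scikit-learn provides under that name.

```python
import numpy as np

def reliability_table(probs, outcomes, n_bins: int = 5):
    """Return (bin_low, bin_high, mean_predicted, observed_rate, count) per non-empty bin."""
    probs = np.asarray(probs, dtype=float)
    outcomes = np.asarray(outcomes, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Interior edges only, so bin indices run from 0 to n_bins - 1.
    bin_idx = np.digitize(probs, edges[1:-1])
    rows = []
    for b in range(n_bins):
        mask = bin_idx == b
        if not mask.any():
            continue
        rows.append((
            float(edges[b]),
            float(edges[b + 1]),
            float(probs[mask].mean()),     # what the model claimed
            float(outcomes[mask].mean()),  # what actually happened
            int(mask.sum()),
        ))
    return rows
```

For a well-calibrated model the third and fourth columns are close in every bin. A bin where the model says 0.7 on average but only half of the outcomes are positive is the kind of gap calibration is meant to close.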

## Walk-Forward Validation

**This is the single most important concept in trading ML.** Standard k-fold cross-validation, especially with shuffling, trains on data from after the test period, which creates lookahead bias. Walk-forward validation respects time ordering.

### How It Works

```
Window 1: [===TRAIN===][GAP][=TEST=]
Window 2:    [===TRAIN===][GAP][=TEST=]
Window 3:       [===TRAIN===][GAP][=TEST=]
Window 4:          [===TRAIN===][GAP][=TEST=]
```

Each window:
1. Train on past N bars
2. Skip a gap (embargo) equal to the forward return horizon
3. Predict on next M bars
4. Record out-of-sample predictions
5. Slide forward and repeat

### Typical Parameters

| Parameter | Value | Rationale |
|-----------|-------|-----------|
| Train window | 30 days (720 hourly bars) | Enough data to learn, recent enough to be relevant |
| Test window | 7 days (168 hourly bars) | Enough predictions for statistical significance |
| Step size | 1 day (24 bars) | Test windows overlap, so each bar gets several out-of-sample predictions |
| Gap (embargo) | Same as forward horizon | Prevents label leakage |
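
As a quick sanity check on these parameters, the number of windows a dataset yields follows from simple arithmetic: one window needs `train + gap + test` bars, and every extra `step` bars adds another. The helper below is a small sketch with a hypothetical name, `count_walk_forward_windows`; its defaults mirror the table, with the gap set to a 24-bar horizon as an assumption.

```python
def count_walk_forward_windows(
    n_samples: int,
    train_size: int = 720,
    test_size: int = 168,
    step_size: int = 24,
    gap: int = 24,
) -> int:
    """Number of complete train/gap/test windows that fit in n_samples bars."""
    span = train_size + gap + test_size  # bars consumed by a single window
    if n_samples < span:
        return 0
    return (n_samples - span) // step_size + 1

# 90 days of hourly bars: (2160 - 912) // 24 + 1 windows
n_windows = count_walk_forward_windows(2160)
```

With a 168-bar test window and a 24-bar step, a bar in the middle of the dataset falls inside up to 168 / 24 = 7 different test windows, which matters when predictions are aggregated later.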

### Walk-Forward Implementation

```python
from typing import Iterator

import numpy as np

def walk_forward_splits(
    n_samples: int,
    train_size: int = 720,
    test_size: int = 168,
    step_size: int = 24,
    gap: int = 24,
) -> Iterator[tuple[np.ndarray, np.ndarray]]:
    """Generate walk-forward train/test index splits.

    Args:
        n_samples: Total number of samples.
        train_size: Number of training samples per window.
        test_size: Number of test samples per window.
        step_size: Step between successive windows.
        gap: Gap between train end and test start.

    Yields:
        Tuples of (train_indices, test_indices).
    """
    start = 0
    while start + train_size + gap + test_size <= n_samples:
        train_idx = np.arange(start, start + train_size)
        test_start = start + train_size + gap
        test_idx = np.arange(test_start, test_start + test_size)
        yield train_idx, test_idx
        start += step_size
```

See `references/validation_methods.md` for purged CV, CPCV, and evaluation metrics.

## Model Training Pipeline

### Full Pipeline Overview

1. **Feature engineering** — compute technical indicators, on-chain metrics, volume features (see `feature-engineering` skill)
2. **Label creation** — forward returns with threshold, drop neutral zone
3. **Walk-forward split** — time-ordered train/test windows with gap
4. **Train model** — XGBoost or LightGBM on each training window
5. **Predict on test** — generate out-of-sample probability predictions
6. **Aggregate predictions** — concatenate all out-of-sample results
7. **Evaluate** — accuracy, precision, recall, F1, AUC, profit factor
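
Steps 3 to 6 can be sketched as a single loop. Because test windows overlap when the step is smaller than the test window, a plain concatenation would count some bars several times; the sketch below averages per bar instead, which is one reasonable choice rather than the skill's documented behavior. `fit_predict` is a hypothetical callable that trains on the given train indices and returns probabilities for the test indices.

```python
from typing import Callable, Iterable

import numpy as np

def aggregate_oos_predictions(
    n_samples: int,
    splits: Iterable[tuple[np.ndarray, np.ndarray]],
    fit_predict: Callable[[np.ndarray, np.ndarray], np.ndarray],
) -> np.ndarray:
    """Average out-of-sample probabilities per bar across walk-forward windows.

    Bars that never appear in a test window are returned as NaN.
    """
    prob_sum = np.zeros(n_samples)
    counts = np.zeros(n_samples)
    for train_idx, test_idx in splits:
        # Hypothetical callable: fit on train_idx, predict probabilities for test_idx.
        probs = np.asarray(fit_predict(train_idx, test_idx), dtype=float)
        prob_sum[test_idx] += probs
        counts[test_idx] += 1
    out = np.full(n_samples, np.nan)
    covered = counts > 0
    out[covered] = prob_sum[covered] / counts[covered]
    return out
```

The `splits` argument accepts the generator returned by `walk_forward_splits`, and the resulting array lines up index-for-index with the labels and forward returns used in evaluation.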

### Quick Training Example

```python
from xgboost import XGBClassifier

model = XGBClassifier(
    n_estimators=200,
    max_depth=4,
    learning_rate=0.05,
    subsample=0.8,
    colsample_bytree=0.8,
    eval_metric="logloss",
    random_state=42,
)

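# X_val / y_val are assumed to be a later, time-ordered slice held out
# from the end of the training window (never shuffled).
model.fit(
    X_train, y_train,
    eval_set=[(X_val, y_val)],
    verbose=False,
)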

probabilities = model.predict_proba(X_test)[:, 1]
```

See `references/model_guide.md` for parameter recommendations and tuning.

## SHAP Feature Importance

SHAP (SHapley Additive exPlanations) provides the gold standard for understanding model predictions.

### Global Feature Importance

Which features matter most across all predictions:

```python
import shap

explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_test)

# Summary plot (top 15 features)
shap.summary_plot(shap_values, X_test, max_display=15)
```

### Local Explanations

Why a specific prediction was made:

```python
# Explain a single prediction
shap.force_plot(explainer.expected_value, shap_values[0], X_test.iloc[0])
```

### Temporal Feature Importance

Track how feature importance drifts over walk-forward windows. If a feature's importance drops significantly, the market regime may have shifted.
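
A simple way to track this is to reduce each window's SHAP matrix to a per-feature share of total mean |SHAP| and compare the latest window against earlier ones. The sketch below assumes one 2-D array of shape (samples, features) per window, as `TreeExplainer.shap_values` returns for a binary XGBoost model; the function names and the 50% drop rule are illustrative assumptions.

```python
import numpy as np

def importance_shares(shap_values: np.ndarray) -> np.ndarray:
    """Mean |SHAP| per feature, normalized so the shares sum to 1."""
    mean_abs = np.abs(np.asarray(shap_values, dtype=float)).mean(axis=0)
    total = mean_abs.sum()
    return mean_abs / total if total > 0 else mean_abs

def flag_importance_drift(window_shap, feature_names, drop_ratio: float = 0.5):
    """Names of features whose latest share fell below drop_ratio x their earlier average."""
    if len(window_shap) < 2:
        raise ValueError("need at least two windows to compare")
    shares = np.vstack([importance_shares(s) for s in window_shap])
    baseline = shares[:-1].mean(axis=0)  # average share over earlier windows
    latest = shares[-1]                  # share in the most recent window
    return [
        name
        for name, base, cur in zip(feature_names, baseline, latest)
        if base > 0 and cur < drop_ratio * base
    ]
```

Using shares rather than raw magnitudes keeps windows comparable even when the overall scale of the model's output changes between retrains.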

## Threshold Optimization

The default 0.5 probability threshold is almost never optimal for trading.

### Why Not 0.5?

- Class imbalance: if 60% of labels are "up", predicted probabilities tend to sit above 0.5, so a 0.5 threshold fires on most bars
- Trading costs: marginal signals (0.51 probability) rarely cover transaction costs
- Asymmetric payoffs: precision matters more than recall for trading
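
The trade-off in the list above is easy to inspect directly: raising the threshold reduces the number of signals and recall, and usually raises precision. The helper below is a minimal sketch with a hypothetical name, `signal_stats`, that reports all three for one threshold.

```python
import numpy as np

def signal_stats(probabilities, labels, threshold: float) -> dict:
    """Signal count, precision, and recall for the 'up' class at a given threshold."""
    probabilities = np.asarray(probabilities, dtype=float)
    labels = np.asarray(labels).astype(bool)
    signals = probabilities >= threshold
    n_signals = int(signals.sum())
    true_pos = int((signals & labels).sum())
    n_pos = int(labels.sum())
    return {
        "n_signals": n_signals,
        # Share of fired signals that were correct.
        "precision": true_pos / n_signals if n_signals else float("nan"),
        # Share of all up-moves that the signals caught.
        "recall": true_pos / n_pos if n_pos else float("nan"),
    }
```

Calling it over a range of thresholds on out-of-sample predictions shows how quickly the signal count shrinks as precision improves, which is the context needed before optimizing for profit factor.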

### Optimize for Profit Factor

```python
def optimize_threshold(
    probabilities: np.ndarray,
    returns: np.ndarray,
    thresholds: np.ndarray | None = None,
) -> tuple[float, float]:
    """Find threshold that maximizes profit factor.

    Args:
        probabilities: Model predicted probabilities.
        returns: Actual forward returns.
        thresholds: Thresholds to search over.

    Returns:
        Tuple of (best_threshold, best_profit_factor).
    """
    if thresholds is None:
        thresholds = np.arange(0.50, 0.85, 0.01)
    best_threshold, best_pf = 0.5, 0.0
    for t in thresholds:
        signals = probabilities >= t
        if signals.sum() < 10:
            continue
        signal_returns = returns[signals]
        wins = signal_returns[signal_returns > 0].sum()
        losses = abs(signal_returns[signal_returns < 0].sum())
        # With no losing trades the profit factor is unbounded, not zero.
        pf = wins / losses if losses > 0 else (float("inf") if wins > 0 else 0.0)
        if pf > best_pf:
            best_pf = pf
            best_threshold = t
    return best_threshold, best_pf
```

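Typical finding: optimal threshold is 0.60-0.75 for crypto trading signals. Because the search picks the best of many candidates, choose the threshold on earlier out-of-sample predictions and confirm it on later ones rather than reporting the optimized profit factor directly.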

## Crypto-Specific Considerations

### Short Training Windows

Crypto market regimes change fast. A model trained on 6 months of data may perform worse than one trained on 30 days. Use shorter training windows and retrain frequently.

### Class Imbalance

Most time periods are "flat" (returns within the neutral zone). Strategies to handle this:

- **Drop neutral zone**: only train on clear up/down labels
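- **Reweight classes**: `scale_pos_weight` in XGBoost upweights the positive class (a common starting value is negatives / positives); this is reweighting, not undersampling
- **SMOTE**: synthetic minority oversampling (use cautiously: fit it only inside each training window, since synthetic points built from test-period samples leak future information)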
- **Adjust threshold**: raise the probability threshold to compensate
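
For the reweighting option, the XGBoost documentation suggests the ratio of negative to positive instances as a typical `scale_pos_weight`. The sketch below computes that ratio from a label array that may still contain NaN for neutral bars; the helper name is an assumption, and the value should be computed from the training window only.

```python
import numpy as np

def compute_scale_pos_weight(labels) -> float:
    """Ratio of negative (0) to positive (1) labels, ignoring NaN entries."""
    y = np.asarray(labels, dtype=float)
    y = y[~np.isnan(y)]  # drop neutral / unlabeled bars
    n_pos = int((y == 1).sum())
    n_neg = int((y == 0).sum())
    if n_pos == 0:
        raise ValueError("no positive labels in this window")
    return n_neg / n_pos
```

The result can be passed as `XGBClassifier(scale_pos_weight=...)`. A value above 1 makes errors on the positive class more costly, and a value below 1 does the opposite.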

### Transaction Costs

A model with 55% accuracy sounds good, but after 0.5% round-trip costs (slippage + fees), many signals become unprofitable. Always evaluate signals net of costs:

```python
net_return = gross_return - 0.005  # 50 bps round-trip
```
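
To make the 55% example concrete, the expected net return per trade is `p * avg_win - (1 - p) * avg_loss - cost`, and setting it to zero gives the breakeven win rate `(avg_loss + cost) / (avg_win + avg_loss)`. The sketch below encodes both; the 2% average win and loss in the usage lines are illustrative assumptions, not figures from the skill.

```python
def expected_net_return(
    win_rate: float, avg_win: float, avg_loss: float, round_trip_cost: float = 0.005
) -> float:
    """Expected per-trade return after costs; avg_win and avg_loss are positive magnitudes."""
    return win_rate * avg_win - (1 - win_rate) * avg_loss - round_trip_cost

def breakeven_win_rate(
    avg_win: float, avg_loss: float, round_trip_cost: float = 0.005
) -> float:
    """Win rate at which the expected net return is exactly zero."""
    return (avg_loss + round_trip_cost) / (avg_win + avg_loss)

# Assumed 2% average win and 2% average loss, 50 bps round-trip cost.
edge = expected_net_return(0.55, 0.02, 0.02)   # 0.011 - 0.009 - 0.005 = -0.003
needed = breakeven_win_rate(0.02, 0.02)        # 0.025 / 0.04 = 0.625
```

Under these assumptions a 55% hit rate loses about 0.3% per trade, and the model would need to be right 62.5% of the time just to break even.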

### Feature Decay

Features lose predictive power over time as more participants discover and trade on them. Monitor rolling performance and retrain when metrics degrade.
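
Monitoring can be as simple as a rolling hit rate over the most recent out-of-sample predictions, with a retrain triggered when it falls below a baseline. The sketch below is one possible approach: the function names, the 168-bar window, and the 0.5 baseline are all assumptions to adapt.

```python
import numpy as np

def rolling_hit_rate(correct, window: int = 168) -> np.ndarray:
    """Rolling mean of a 0/1 'prediction was correct' series; NaN until the window fills."""
    correct = np.asarray(correct, dtype=float)
    out = np.full(correct.shape, np.nan)
    if len(correct) < window:
        return out
    # Cumulative-sum trick: sum over each window is the difference of two prefix sums.
    csum = np.concatenate(([0.0], np.cumsum(correct)))
    out[window - 1:] = (csum[window:] - csum[:-window]) / window
    return out

def needs_retrain(correct, window: int = 168, baseline: float = 0.5) -> bool:
    """True when the most recent rolling hit rate is below the baseline."""
    rates = rolling_hit_rate(correct, window)
    if rates.size == 0 or np.isnan(rates[-1]):
        return False  # not enough history to judge
    return bool(rates[-1] < baseline)
```

The same pattern works for other rolling metrics, such as precision on fired signals or net return per trade, if those track the strategy's objective more closely than accuracy does.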

## Integration with Other Skills

| Skill | Integration |
|-------|-------------|
| `feature-engineering` | Compute input features for the classifier |
| `vectorbt` | Backtest trading strategies from ML signals |
| `regime-detection` | Train separate models per regime, or use regime as a feature |
| `position-sizing` | Size positions based on classifier confidence |
| `risk-management` | Apply portfolio-level risk limits to ML-generated signals |

## Files

### References
- `references/model_guide.md` — XGBoost and LightGBM parameter guide, tuning, and ensembling
- `references/validation_methods.md` — Walk-forward, purged CV, CPCV, and evaluation metrics

### Scripts
- `scripts/train_classifier.py` — Train a signal classifier with walk-forward validation and feature importance
- `scripts/walk_forward_backtest.py` — Backtest ML signals vs buy-and-hold with walk-forward validation

## Dependencies

```bash
# Core (required)
uv pip install pandas numpy scikit-learn

# Optional (recommended)
uv pip install xgboost lightgbm shap
```

## Key Takeaways

1. **Walk-forward validation is non-negotiable** — random CV will give you wildly inflated results
2. **Optimize threshold for profit factor**, not accuracy — a high-precision, low-recall model beats a high-accuracy one
3. **Short training windows** for crypto — 30 days beats 6 months in most regimes
4. **Monitor feature decay** — retrain when rolling metrics drop below baseline
5. **Always evaluate net of costs** — a 55% accurate model may be unprofitable after fees
6. **SHAP over raw feature importance** — SHAP gives consistent, theoretically grounded explanations

Related Skills

All from agiprolabs/claude-trading-skills:

yield-analysis: DeFi yield evaluation including fee APR, real vs nominal yield, net APY after costs, and yield sustainability analysis

yellowstone-grpc: Real-time Solana transaction and account streaming via Yellowstone gRPC (Geyser plugin)

whale-tracking: Large wallet monitoring, accumulation and distribution detection, and smart money signal generation for Solana tokens

wash-sale-detection: Wash sale detection under 2025 US crypto rules with 61-day window monitoring, disallowed loss tracking, and safe re-entry countdown

wallet-profiling: Behavioral classification, performance analysis, and trading style detection for Solana wallets

walk-forward-validation: Walk-forward validation framework for trading strategies and ML models with time-series-aware splits, overfit detection, and regime-aware validation

volatility-modeling: Volatility estimation, forecasting, and regime classification using GARCH, EWMA, realized volatility, and volatility cones

vectorbt: High-performance vectorized backtesting with parameter optimization, portfolio simulation, and rich performance metrics

trading-visualization: Professional trading charts including candlesticks, equity curves, drawdowns, correlation heatmaps, and return distributions

trade-journal: Structured trade logging, performance review, behavioral pattern detection, and strategy attribution for systematic improvement

trade-accounting: Double-entry bookkeeping for trading operations with ledger management, P&L statements, balance sheets, and cash flow reporting

token-holder-analysis: Token holder distribution, concentration metrics, insider detection, and supply analysis for Solana tokens