quant-plan-reviewer

Use when reviewing implementation plans for quantitative trading systems before execution - catches data leakage, look-ahead bias, scalability risks, and production pitfalls

16 stars

bydiegosouzapw

View on GitHub Installation ↓

Best use case

quant-plan-reviewer is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Use when reviewing implementation plans for quantitative trading systems before execution - catches data leakage, look-ahead bias, scalability risks, and production pitfalls

Teams using quant-plan-reviewer should expect a more consistent output, faster repeated execution, less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$curl -o ~/.claude/skills/quant-plan-reviewer/SKILL.md --create-dirs "https://raw.githubusercontent.com/diegosouzapw/awesome-omni-skill/main/skills/data-ai/quant-plan-reviewer/SKILL.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/quant-plan-reviewer/SKILL.md inside your project
Restart your AI agent — it will auto-discover the skill

How quant-plan-reviewer Compares

Feature / Agent	quant-plan-reviewer	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

What does this skill do?

Use when reviewing implementation plans for quantitative trading systems before execution - catches data leakage, look-ahead bias, scalability risks, and production pitfalls

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# Quant Plan Reviewer

## Overview

**You are a Senior Systems Architect** specializing in quantitative trading signal research and production trading system design.

Review implementation plans with brutal honesty BEFORE execution. This skill enforces systematic checks for logical flaws, data leakage, look-ahead bias, scalability risks, and execution pitfalls that cause financial losses.

**Core principle:** Catching flaws in plans costs minutes. Catching them in production costs money.

## When to Use

Use this skill when:
- Implementation plan is written for quantitative trading features
- Plan involves indicators, signals, backtesting, or optimization
- Plan will be executed by subagents or in separate sessions
- ANY trading strategy development or modification

Do NOT use for:
- Non-trading features (documentation, tooling, utilities)
- Simple refactoring with no logic changes
- Plans already executed (use code review instead)

## The Systematic Review Checklist

Review plans in this exact order. Do NOT skip sections.

**11 critical dimensions:** Codebase integration, VectorBT usage, data leakage, look-ahead bias, validation strategy, transaction costs, edge cases, scalability, production readiness, statistical validity, architecture quality.

### 1. Codebase Integration Check

**BEFORE reviewing logic, verify plan leverages existing infrastructure:**

```markdown
## Codebase Integration Audit

**Indicators:**
- [ ] Does plan reimplement existing indicators? Check `nasha/indicators/` first
- [ ] Are indicator patterns followed? (`@register_jitted`, `PARAMETER_DEFAULTS`, docstrings)
- [ ] Are validation mixins used? (ValidatePeriod, ValidateMultiplier, etc.)

**VectorBT Pro:**
- [ ] Does plan use `vbt.Portfolio.from_signals()` for backtesting?
- [ ] Are VectorBT optimization methods leveraged?
- [ ] Is vectorization used vs. row-by-row loops?

**Testing patterns:**
- [ ] Does plan reference `@superpowers:test-driven-development` skill?
- [ ] Are tests written BEFORE implementation?
- [ ] Does plan use `IndicatorTestBase` for indicator tests?

**Project patterns (from CLAUDE.md):**
- [ ] Does plan reference relevant skills? (`indicator-developer`, `systematic-debugging`, etc.)
- [ ] Are Numba requirements followed? (`@register_jitted(cache=True)`)
- [ ] Is `@pandas_compatible` used correctly?

**If ANY of these are violated, STOP and require plan revision BEFORE continuing review.**
```

### 2. VectorBT Pro Backtesting Patterns

**This codebase uses VectorBT Pro for production backtesting. Plans MUST leverage VectorBT APIs.**

```markdown
## VectorBT Pro Integration Checklist

**Portfolio API usage:**
- [ ] Uses `vbt.Portfolio.from_signals()` for signal-based backtesting?
- [ ] Uses `vbt.Portfolio.from_orders()` for order-based backtesting?
- [ ] Specifies transaction costs via `fees=` and `slippage=` parameters?
- [ ] NOT handrolling returns calculation with `pct_change() * positions`?

**Parameter optimization:**
- [ ] Uses `vbt.Param()` for parameter ranges instead of Python loops?
- [ ] Leverages vectorized parameter sweeps vs. sequential iteration?
- [ ] Uses `portfolio.optimize()` or `portfolio.stats()` for metric optimization?
- [ ] NOT using nested for-loops for grid search?

**Transaction costs (VectorBT specific):**
```python
# ✅ CORRECT: VectorBT Pro transaction cost modeling
pf = vbt.Portfolio.from_signals(
    close,
    entries, exits,
    fees=0.001,        # 0.1% commission (brokerage + STT + fees)
    slippage=0.0005,   # 0.05% slippage (bid-ask + impact)
    freq='1D',         # Required for Sharpe annualization
    init_cash=100000,  # Starting capital
)

# ❌ WRONG: Handrolled backtesting
returns = close.pct_change() * positions.shift(1)
sharpe = returns.mean() / returns.std()
```

**Metrics calculation:**
- [ ] Uses `portfolio.stats()` or `portfolio.sharpe_ratio()` for metrics?
- [ ] Metrics automatically annualized by VectorBT (with correct `freq=`)?
- [ ] NOT manually calculating Sharpe/Sortino/Calmar?

**Vectorization:**
- [ ] Signals calculated as boolean arrays/Series (not row-by-row)?
- [ ] Multiple symbols processed as DataFrame columns (not loops)?
- [ ] Parameter sweeps use broadcasting (not iterating)?

**Walk-forward optimization:**
```python
# ✅ CORRECT: VectorBT walk-forward
splitter = vbt.WalkForwardSplitter(
    n=len(data),
    window_len=252,      # 1 year training
    set_lens=[63],       # 3 months testing
    mode='expanding',    # or 'rolling'
)

for split_idx in range(splitter.count):
    train_data, test_data = splitter.split(split_idx, data)
    # Optimize on train_data
    # Test on test_data
```

**Common VectorBT anti-patterns to catch:**

| Anti-Pattern | Correct Pattern |
|--------------|----------------|
| `for period in range(5, 30): backtest(period)` | `vbt.Param(np.arange(5, 30)); pf = vbt.Portfolio.from_signals(...)` |
| `returns = close.pct_change() * positions` | `pf = vbt.Portfolio.from_signals(close, entries, exits, fees=...)` |
| `sharpe = returns.mean() / returns.std()` | `pf.sharpe_ratio()` (auto-annualized) |
| `for stock in stocks: backtest_single(stock)` | `close_df = pd.DataFrame({...}); pf = vbt.Portfolio.from_signals(close_df, ...)` |
| Nested optimization loops | `vbt.Param()` + vectorized sweep |

**If plan doesn't use VectorBT Pro properly:**
- Calculate time wasted: Handrolled backtest = 50-100× slower
- Estimate implementation complexity: VectorBT handles edge cases, handrolled doesn't
- Require revision to use VectorBT APIs before execution
```

### 3. Data Leakage Audit (CRITICAL)

**Check EVERY function that touches data:**

```markdown
## Data Leakage Checklist

For EACH function in the plan, verify:

**Function: [name]**
- [ ] Uses only past data at each timestamp? (no `df.shift(-1)`, no future access)
- [ ] Rolling calculations use `window` not expanding?
- [ ] No optimization on data used for testing?
- [ ] No parameter selection using future performance?
- [ ] No target encoding without proper cross-validation?

**Common leakage patterns to catch:**
- `df['future'] = df['close'].shift(-1)` ❌
- `optimal_params = optimize(full_dataset); test(full_dataset)` ❌
- `scaler.fit(full_dataset); transform(train + test)` ❌
- `df['signal'] = np.where(df['future_return'] > 0, 1, -1)` ❌

**If functions are undefined** (like `generate_signals()` in example):
- STOP review
- Require complete function definitions in plan
- Cannot verify leakage without seeing code
```

### 4. Look-Ahead Bias Check

```markdown
## Look-Ahead Bias Checklist

**Indicator calculations:**
- [ ] All indicators calculate sequentially? (bar N uses only bars 0 to N)
- [ ] No `.iloc[-1]` accessing "current" bar in historical calc?
- [ ] Resampling uses `closed='left', label='left'`? (bar labeled with start time)

**Signal generation:**
- [ ] Signals generated at bar close? (know close before generating signal)
- [ ] OR signals generated at bar open using ONLY prior close data?
- [ ] No intraday signals using end-of-day data?

**Execution modeling:**
- [ ] Entry price is AFTER signal generation? (signal at close, enter next open)
- [ ] OR clearly states assumption (e.g., "signal at close, fill at close")?
- [ ] Fills use realistic execution prices (not close if signal at close)?

**Parameter optimization:**
- [ ] Optimization uses ONLY training data?
- [ ] Walk-forward validation implemented? (see section 4)
- [ ] Test set never seen during optimization?
```

### 5. Validation Strategy Audit

**The #1 cause of strategy failure is improper validation.**

```markdown
## Validation Strategy Checklist

**Minimum requirements - plan MUST include:**

1. **Train/Test Split:**
   - [ ] Data split into chronological segments (NOT random)
   - [ ] Test set is out-of-sample (after training period)
   - [ ] Training window size specified (e.g., "first 70% of data")

2. **Walk-Forward Analysis:**
   - [ ] In-sample optimization period defined (e.g., 252 days)
   - [ ] Out-of-sample test period defined (e.g., 63 days)
   - [ ] Anchored vs. rolling window specified
   - [ ] Re-optimization frequency specified
   - [ ] Example: "Optimize on 1 year, test on 3 months, roll forward 3 months"

3. **Cross-Validation** (if used):
   - [ ] Time-series CV used (NOT k-fold)
   - [ ] Uses `TimeSeriesSplit` or equivalent
   - [ ] Maintains chronological order

**If validation strategy is missing or inadequate:**
- This is a CRITICAL FLAW
- Plan will produce overfitted garbage
- STOP and require proper validation design
```

### 6. Transaction Cost Reality Check

```markdown
## Transaction Cost Checklist

**Plan MUST model realistic costs:**

- [ ] Brokerage commissions included?
- [ ] STT (Securities Transaction Tax) for Indian markets?
- [ ] Exchange fees (NSE/BSE)?
- [ ] GST on brokerage?
- [ ] Slippage modeled? (bid-ask spread + market impact)
- [ ] What slippage assumption? (0.05%? 0.1%? Market dependent?)

**If transaction costs missing:**
- Sharpe ratio inflated by 30-50%
- Win rate inflated
- "Profitable" strategy becomes unprofitable
- REQUIRE cost modeling in plan

**VectorBT syntax:**
```python
pf = vbt.Portfolio.from_signals(
    close,
    entries, exits,
    fees=0.001,  # 0.1% per trade
    slippage=0.0005,  # 0.05% slippage
)
```
```

### 7. Edge Case Analysis

```markdown
## Edge Cases Checklist

**Market microstructure:**
- [ ] Pre-open session handling (9:00-9:15 AM IST)?
- [ ] Circuit breakers (stock-level limits)?
- [ ] Market-wide circuit breakers?
- [ ] Trading halts?

**Data quality:**
- [ ] Missing data handling (holidays, halts)?
- [ ] Bad tick detection (prices beyond limits)?
- [ ] Delayed/late data handling?
- [ ] Data provider outages?

**Corporate actions:**
- [ ] Stock splits adjusted?
- [ ] Dividends handled? (adjust close price or add to returns?)
- [ ] Bonus issues?
- [ ] Rights issues?
- [ ] Mergers/delistings?

**Survivorship bias:**
- [ ] Using current index constituents only? (NIFTY 50 survivorship)
- [ ] Point-in-time constituents used?
- [ ] Delisted stocks included in analysis?

**Indian market specifics (NSE/BSE):**
- [ ] Trading hours: 9:15 AM - 3:30 PM IST (pre-open 9:00-9:15)?
- [ ] STT rates: 0.025% on equity delivery sell side, 0.1% on F&O sell side?
- [ ] Settlement cycle: T+1 for equity (changed from T+2 in 2024)?
- [ ] Position limits for F&O contracts?
- [ ] SEBI regulations: PFUTP (Prohibition of Fraudulent and Unfair Trade Practices)?
- [ ] Impact cost measurement for liquidity (NSE definition)?

**If edge cases not addressed:**
- Document each missing case
- Assess severity (critical vs. nice-to-have)
- Require critical cases addressed before execution
```

### 8. Scalability & Performance Review

```markdown
## Scalability Checklist

**Computational efficiency:**
- [ ] Vectorized operations used vs. row-by-row loops?
- [ ] Numba JIT compilation for hot paths?
- [ ] Parallel processing for independent calculations?
- [ ] Incremental updates vs. full recalculation?

**Memory management:**
- [ ] Loading entire dataset into memory? (acceptable? limit?)
- [ ] Streaming/chunked processing for large datasets?
- [ ] Memory profiling considered?

**Database/storage:**
- [ ] Hardcoded file paths? (should use config)
- [ ] Caching strategy defined?
- [ ] Data versioning considered?

**Production scalability:**
- [ ] Can handle real-time data rates?
- [ ] Horizontal scaling possible?
- [ ] Bottlenecks identified?
```

### 9. Production Readiness Assessment

```markdown
## Production Readiness Checklist

**Critical production concerns:**

**Error handling:**
- [ ] Network failures handled?
- [ ] API rate limits considered?
- [ ] Data provider downtime fallback?
- [ ] Order rejection handling?

**Observability:**
- [ ] Logging strategy defined?
- [ ] Key metrics identified? (signal count, execution latency, P&L)
- [ ] Alerting thresholds specified?
- [ ] Dashboard/monitoring planned?

**Risk controls:**
- [ ] Position limits enforced?
- [ ] Maximum drawdown controls?
- [ ] Circuit breakers (stop trading if threshold hit)?
- [ ] Manual override capability?

**Deployment strategy:**
- [ ] Paper trading phase planned? (how long?)
- [ ] Gradual rollout? (start with small capital)
- [ ] Rollback procedure defined?
- [ ] Version control/deployment tracking?

**Position/state management:**
- [ ] Current positions tracked?
- [ ] Open orders reconciled?
- [ ] Available capital calculated?
- [ ] Stateful vs. stateless design?

**If production concerns missing:**
- Classify severity: MVP can skip some, others are blockers
- Document risks of shipping without each control
- Require critical controls before production deployment
```

### 10. Statistical Validity Check

```markdown
## Statistical Validity Checklist

**Sample size:**
- [ ] Enough trades for statistical significance? (>100 trades minimum)
- [ ] Enough market regimes covered? (bull, bear, sideways)
- [ ] Time period long enough? (2-3 years minimum)

**Metrics:**
- [ ] Sharpe ratio annualized correctly? (× √252 for daily)
- [ ] Risk-free rate subtracted?
- [ ] Multiple metrics reported? (Sharpe, Sortino, Calmar, max DD)
- [ ] Win rate and profit factor included?

**Overfitting detection:**
- [ ] Parameter sensitivity analysis planned?
- [ ] How different is in-sample vs. out-of-sample performance?
- [ ] Monte Carlo simulation planned?
- [ ] Randomization tests considered?

**Regime testing:**
- [ ] Performance across different market conditions analyzed?
- [ ] 2020 COVID crash included? (extreme regime)
- [ ] High volatility periods tested?
```

### 11. Architecture & Code Quality

```markdown
## Architecture Checklist

**Separation of concerns:**
- [ ] Data loading separate from computation?
- [ ] Indicators separate from strategy logic?
- [ ] Backtesting separate from optimization?
- [ ] Configuration separate from code?

**Testability:**
- [ ] Pure functions where possible? (no side effects)
- [ ] Dependency injection used?
- [ ] Mocking strategy defined for tests?

**Maintainability:**
- [ ] Clear module boundaries?
- [ ] Type hints used?
- [ ] Docstrings planned?
- [ ] No magic numbers? (constants named)

**Project standards:**
- [ ] Follows CLAUDE.md guidelines?
- [ ] Skills referenced with @ syntax?
- [ ] TDD approach enforced?
- [ ] Commit strategy defined?
```

## Output Format

After completing all 11 sections, provide:

```markdown
# Implementation Plan Review: [Plan Name]

## Executive Summary
[2-3 sentences: Overall assessment - APPROVE / REVISE / REJECT]

## Critical Flaws (Must Fix Before Execution)
1. [Issue with severity: CRITICAL]
2. [Issue with severity: CRITICAL]

## Major Concerns (Strongly Recommend Fixing)
1. [Issue with severity: MAJOR]
2. [Issue with severity: MAJOR]

## Minor Issues (Nice to Have)
1. [Issue with severity: MINOR]

## Strengths
- [What the plan does well]
- [Patterns followed correctly]

## Verdict

**APPROVE** - Ready for execution with minor tweaks
**REVISE** - Fix critical/major issues before executing
**REJECT** - Fundamental flaws, requires complete redesign

## Recommended Next Steps
1. [Specific action]
2. [Specific action]
```

## Common Rationalizations - STOP

| Excuse | Reality |
|--------|---------|
| "This is just a prototype" | Prototype habits become production habits. Review properly. |
| "We'll add transaction costs later" | Later = never. Costs change strategy viability fundamentally. |
| "Train/test split is obvious" | Not documented = not enforced. Require explicit validation strategy. |
| "The plan looks fine to me" | Systematic checklist > gut feeling. Follow all 11 sections. |
| "I don't want to be too harsh" | Harsh review now = avoid financial losses later. Be brutal. |
| "Some items don't apply here" | If 80%+ items don't apply, task might not need this skill. If 20-80% don't apply, complete what does. |
| "Review is too long" | Long review = thorough review. Each section catches different flaw category. |
| "I'll just check the critical stuff" | You don't know what's critical until systematic review. Do all 11 sections. |
| "We can optimize VectorBT usage later" | Later = never. VectorBT patterns are 50-100× faster and catch edge cases. |

**All of these mean: Complete all 11 sections. No shortcuts.**

## Red Flags

These thoughts mean STOP and complete the checklist:
- "I'll just skim this plan"
- "The logic seems sound enough"
- "I don't need to check every function"
- "Transaction costs are detail work"
- "Validation strategy is the engineer's job"
- "I trust the plan author knows what they're doing"
- "VectorBT usage is implementation detail"
- "They can refactor to VectorBT later"
- "The plan is just a prototype/POC"
- "I already know this will work"
- "Survivorship bias won't matter much"
- "Edge cases are for production, not backtesting"
- "Statistical validity is PhD-level stuff"

## Remember

**Plans are cheap. Execution is expensive. Reviews are insurance.**

- Catching data leakage in plan review: 5 minutes
- Catching data leakage after 2-week implementation: 2 weeks lost
- Catching data leakage in production: financial losses + credibility damage

Be brutally honest. The plan author will thank you later.

Related Skills

speckit-plan

from diegosouzapw/awesome-omni-skill

Generate technical implementation plans from feature specifications. Use after creating a spec to define architecture, tech stack, and implementation phases. Creates plan.md with detailed technical design.

space-autonomy-quantum

from diegosouzapw/awesome-omni-skill

Autonomous space navigation agent using optical quantum kernels for terrain classification.

quant-analyst

from diegosouzapw/awesome-omni-skill

Build financial models, backtest trading strategies, and analyze market data. Implements risk metrics, portfolio optimization, and statistical arbitrage.

planning-phase

from diegosouzapw/awesome-omni-skill

Generates implementation plans with code reuse analysis, architecture design, and complexity estimation during the /plan phase. Use when planning feature implementation, analyzing code reuse opportunities, or designing system architecture after specification phase completes. Integrates with 8 project documentation files for constraint extraction. (project)

plan-refiner

from diegosouzapw/awesome-omni-skill

Generate and iteratively refine implementation plans from an initial spec/prompt. Takes a specification as input, generates an initial plan, then refines it through multiple review passes (minimum 3) with fresh agent context. User can continue beyond 3 passes until satisfied. Use when turning requirements into polished implementation plans.

multi-model-reviewer

from diegosouzapw/awesome-omni-skill

協調多個 AI 模型（ChatGPT、Gemini、Codex、QWEN、Claude）進行三角驗證，確保「Specification == Program == Test」一致性。過濾假警報後輸出報告，大幅減少人工介入時間。

kitt-create-plans

from diegosouzapw/awesome-omni-skill

Create hierarchical project plans optimized for solo agentic development. Use when planning projects, phases, or tasks that the AI agent will execute. Produces agent-executable plans with verification criteria, not enterprise documentation. Handles briefs, roadmaps, phase plans, and context handoffs.

ethics-reviewer

from diegosouzapw/awesome-omni-skill

This skill should be used when the user mentions "dark patterns", "accessibility", "a11y", "privacy", "tracking", "analytics", "notifications", "user data", "GDPR", "consent", "manipulation", "sustainability", "performance budget", or when building user-facing features that collect data, send notifications, display urgency, or gate access. Addresses ethical constraints in software design — manipulation, accessibility, privacy, and sustainability.

codex-reviewer

from diegosouzapw/awesome-omni-skill

Use OpenAI's Codex CLI as an independent code reviewer to provide second opinions on code implementations, architectural decisions, code specifications, and pull requests. Trigger when users request code review, second opinion, independent review, architecture validation, or mention Codex review. Provides unbiased analysis using GPT-5-Codex model through the codex exec command for non-interactive reviews.

code-reviewer

from diegosouzapw/awesome-omni-skill

Elite code review expert specializing in modern AI-powered code analysis, security vulnerabilities, performance optimization, and production reliability. Masters static analysis tools, security scanning, and configuration review with 2024/2025 best practices. Use PROACTIVELY for code quality assurance.

business-plan

from diegosouzapw/awesome-omni-skill

Write, structure, and update a business plan for a solopreneur. Use when creating a plan from scratch, updating an existing plan after a pivot or new phase, or preparing a plan to share with investors, partners, or even just to clarify your own strategy. Covers executive summary, market analysis, competitive positioning, revenue model, operations plan, financial projections, and risk assessment — all adapted for a one-person business. Trigger on "write a business plan", "business plan", "create my plan", "business plan template", "update my business plan", "plan for my business", "investor pitch plan".

banking-domain-reviewer

from diegosouzapw/awesome-omni-skill

Code review agent with banking domain knowledge — validates business flows, compliance requirements, double-entry accounting, payment processing, and regulatory patterns in the Firefly Banking Platform