Data Analysis

Turn raw data into decisions with statistical rigor, proper methodology, and awareness of analytical pitfalls.

1,864 stars

Best use case

Data Analysis is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using Data Analysis should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/data-analysis/SKILL.md --create-dirs "https://raw.githubusercontent.com/LeoYeAI/openclaw-master-skills/main/skills/data-analysis/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/data-analysis/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How Data Analysis Compares

| Feature / Agent | Data Analysis | Standard Approach |
|-----------------|---------------|-------------------|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |

Frequently Asked Questions

What does this skill do?

Turn raw data into decisions with statistical rigor, proper methodology, and awareness of analytical pitfalls.

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

## When to Load

User asks about: analyzing data, finding patterns, understanding metrics, testing hypotheses, cohort analysis, A/B testing, churn analysis, statistical significance.

## Core Principle

Analysis without a decision is just arithmetic. Always clarify: **What would change if this analysis shows X vs Y?**

## Methodology First

Before touching data:
1. **What decision** is this analysis supporting?
2. **What would change your mind?** (the real question)
3. **What data do you actually have** vs what you wish you had?
4. **What timeframe** is relevant?

## Statistical Rigor Checklist

- [ ] Sample size sufficient? (small N = wide confidence intervals)
- [ ] Comparison groups fair? (same time period, similar conditions)
- [ ] Multiple comparisons? (20 tests at α = 0.05 ≈ 1 false "significant" result expected by chance)
- [ ] Effect size meaningful? (statistically significant ≠ practically important)
- [ ] Uncertainty quantified? ("12-18% lift" not just "15% lift")
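
Two of the checklist items above lend themselves to a quick calculation. The sketch below uses only the Python standard library to show how the chance of at least one false positive grows with the number of tests, and how a confidence interval for a proportion narrows as N grows. The function names are illustrative and not part of the skill. The tests are assumed to be independent, and the interval uses the normal (Wald) approximation, which is unreliable for very small samples or rates near 0 or 1.

```python
from math import sqrt
from statistics import NormalDist


def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Probability of at least one false positive across independent tests."""
    return 1 - (1 - alpha) ** n_tests


def bonferroni_alpha(alpha: float, n_tests: int) -> float:
    """Per-test threshold that keeps the family-wise error rate at or below alpha."""
    return alpha / n_tests


def proportion_ci(successes: int, n: int, confidence: float = 0.95) -> tuple[float, float]:
    """Normal-approximation (Wald) confidence interval for a proportion."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * sqrt(p * (1 - p) / n)
    return p - margin, p + margin


if __name__ == "__main__":
    # Hypothetical inputs: 20 tests at alpha = 0.05, and a 15% rate at two sample sizes.
    print(f"Family-wise error rate: {familywise_error_rate(0.05, 20):.2f}")
    print(f"Bonferroni per-test alpha: {bonferroni_alpha(0.05, 20):.4f}")
    for successes, n in [(15, 100), (1500, 10_000)]:
        low, high = proportion_ci(successes, n)
        print(f"n={n}: {low:.3f} to {high:.3f}")
```

With 20 independent tests at α = 0.05, the chance of at least one spurious "significant" result is roughly 64%, which is why the checklist asks about multiple comparisons before any single p-value is trusted.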

## Analytical Pitfalls to Catch

| Pitfall | What it looks like | How to avoid |
|---------|-------------------|--------------|
| Simpson's Paradox | Trend reverses when you segment | Always check by key dimensions |
| Survivorship bias | Only analyzing current users | Include churned/failed in dataset |
| Comparing unequal periods | Feb (28d) vs March (31d) | Normalize to per-day or same-length windows |
| p-hacking | Testing until something is "significant" | Pre-register hypotheses or adjust for multiple comparisons |
| Correlation in time series | Both went up = "related" | Check if controlling for time removes relationship |
| Aggregating percentages | Averaging percentages directly | Re-calculate from underlying totals |

For detailed examples of each pitfall, see `pitfalls.md`.
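
As a minimal illustration of two rows in the table above (Simpson's Paradox and aggregating percentages), the sketch below compares per-segment rates, pooled rates, and a naive average of percentages. The counts are made up for illustration, and the names `Cell`, `pooled_rate`, and `mean_of_rates` are hypothetical; they do not come from `pitfalls.md`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    """Conversions and trials for one variant within one segment."""
    successes: int
    trials: int

    @property
    def rate(self) -> float:
        return self.successes / self.trials


# Made-up counts: variant A is shown mostly to returning users,
# variant B mostly to new users, so the segment mix differs.
data = {
    "A": {"new": Cell(90, 100), "returning": Cell(300, 1000)},
    "B": {"new": Cell(800, 1000), "returning": Cell(20, 100)},
}


def pooled_rate(cells) -> float:
    """Rate recomputed from underlying totals (the correct aggregate)."""
    cells = list(cells)
    return sum(c.successes for c in cells) / sum(c.trials for c in cells)


def mean_of_rates(cells) -> float:
    """Naive average of percentages, which ignores segment sizes."""
    cells = list(cells)
    return sum(c.rate for c in cells) / len(cells)


if __name__ == "__main__":
    for variant, segments in data.items():
        per_segment = ", ".join(f"{name}={cell.rate:.2f}" for name, cell in segments.items())
        print(
            f"{variant}: {per_segment}, "
            f"pooled={pooled_rate(segments.values()):.3f}, "
            f"mean of rates={mean_of_rates(segments.values()):.2f}"
        )
```

In this made-up data, A has the higher rate in both segments, yet B has the higher pooled rate because the segment mix differs between variants. The naive mean of rates for A (0.60) is also far from its pooled rate (about 0.35), which is the reason for recalculating from underlying totals.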

## Approach Selection

| Question type | Approach | Key output |
|---------------|----------|------------|
| "Is X different from Y?" | Hypothesis test | p-value + effect size + CI |
| "What predicts Z?" | Regression/correlation | Coefficients + R² + residual check |
| "How do users behave over time?" | Cohort analysis | Retention curves by cohort |
| "Are these groups different?" | Segmentation | Profiles + statistical comparison |
| "What's unusual?" | Anomaly detection | Flagged points + context |

For technique details and when to use each, see `techniques.md`.
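
For the first row of the table ("Is X different from Y?"), one possible shape of the "p-value + effect size + CI" output is sketched below for two conversion rates. This is a two-proportion z-test written with the standard library only. It is an assumption about what such an analysis could look like, not the method described in `techniques.md`, and the function name and sample counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_test(x1: int, n1: int, x2: int, n2: int, confidence: float = 0.95) -> dict:
    """Two-sided z-test for a difference in proportions (group 2 minus group 1)."""
    p1, p2 = x1 / n1, x2 / n2
    diff = p2 - p1  # effect size as an absolute difference in rates

    # The test statistic uses the pooled rate under the null hypothesis of no difference.
    pooled = (x1 + x2) / (n1 + n2)
    se_null = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if se_null == 0:
        raise ValueError("pooled rate is 0 or 1; the z-test is undefined")
    z = diff / se_null
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))

    # The confidence interval uses the unpooled standard error.
    se_diff = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z_crit = NormalDist().inv_cdf(0.5 + confidence / 2)
    ci = (diff - z_crit * se_diff, diff + z_crit * se_diff)

    return {"diff": diff, "relative_lift": diff / p1, "z": z, "p_value": p_value, "ci": ci}


if __name__ == "__main__":
    # Hypothetical A/B test: 100/1000 conversions in control, 130/1000 in treatment.
    result = two_proportion_test(100, 1000, 130, 1000)
    low, high = result["ci"]
    print(f"Absolute lift: {result['diff']:.3f} (95% CI {low:.3f} to {high:.3f})")
    print(f"p-value: {result['p_value']:.3f}")
```

Reporting the interval alongside the p-value keeps the focus on how large the effect plausibly is, rather than only on whether it cleared a significance threshold.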

## Output Standards

1. **Lead with the insight**, not the methodology
2. **Quantify uncertainty** — ranges, not point estimates
3. **State limitations** — what this analysis can't tell you
4. **Recommend next steps** — what would strengthen the conclusion

## Red Flags to Escalate

- User wants to "prove" a predetermined conclusion
- Sample size too small for reliable inference
- Data quality issues that invalidate analysis
- Confounders that can't be controlled for

Related Skills

us-stock-analysis

1864
from LeoYeAI/openclaw-master-skills

Comprehensive US stock analysis including fundamental analysis (financial metrics, business quality, valuation), technical analysis (indicators, chart patterns, support/resistance), stock comparisons, and investment report generation. Use when user requests analysis of US stock tickers (e.g., "analyze AAPL", "compare TSLA vs NVDA", "give me a report on Microsoft"), evaluation of financial metrics, technical chart analysis, or investment recommendations for American stocks.

ths-advanced-analysis

1864
from LeoYeAI/openclaw-master-skills

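Advanced stock analysis based on thsdk: minute K-lines (1m/5m/15m/30m/60m/120m), sector and index quotes (major indices, Shenwan industries, concept-sector constituents), multi-stock batch comparison (tables, normalized trend charts, correlation heatmaps), order-book depth, large-order flow, call-auction anomalies, intraday time-sharing charts, and historical time-sharing charts. This skill must be used when the user mentions "minute K-line", "intraday trend", "order book", "large orders", "auction anomalies", "sector quotes", "industry ranking", "concept sectors", "constituent stocks", "compare multiple stocks", "batch analysis", "gain comparison", or "correlation", or when the user needs to view two or more stocks at once, focuses on short-term trading, or does quantitative research.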

tech-data-playbook

1864
from LeoYeAI/openclaw-master-skills

World-Class Technology & Data Playbook. Use for: software development best practices, IT infrastructure design, cybersecurity strategy, data analytics, business intelligence, automation & DevOps, cloud computing architecture, AI/ML adoption, technical architecture decisions, digital transformation strategy, platform engineering, CI/CD pipelines, zero-trust security, data governance, FinOps, edge computing, observability, MLOps, and technology leadership. Trigger when discussing ANY technology strategy, engineering practice, data platform, security posture, cloud architecture, AI implementation, or digital transformation topic. If in doubt, use this skill.

stock-analysis

1864
from LeoYeAI/openclaw-master-skills

Analyze stocks and cryptocurrencies using Yahoo Finance data. Supports portfolio management, watchlists with alerts, dividend analysis, 8-dimension stock scoring, viral trend detection (Hot Scanner), and rumor/early signal detection. Use for stock analysis, portfolio tracking, earnings reactions, crypto monitoring, trending stocks, or finding rumors before they hit mainstream.

senior-data-scientist

1864
from LeoYeAI/openclaw-master-skills

World-class senior data scientist skill specialising in statistical modeling, experiment design, causal inference, and predictive analytics. Covers A/B testing (sample sizing, two-proportion z-tests, Bonferroni correction), difference-in-differences, feature engineering pipelines (Scikit-learn, XGBoost), cross-validated model evaluation (AUC-ROC, AUC-PR, SHAP), and MLflow experiment tracking — using Python (NumPy, Pandas, Scikit-learn), R, and SQL. Use when designing or analysing controlled experiments, building and evaluating classification or regression models, performing causal analysis on observational data, engineering features for structured tabular datasets, or translating statistical findings into data-driven business decisions.

senior-data-engineer

1864
from LeoYeAI/openclaw-master-skills

Data engineering skill for building scalable data pipelines, ETL/ELT systems, and data infrastructure. Expertise in Python, SQL, Spark, Airflow, dbt, Kafka, and modern data stack. Includes data modeling, pipeline orchestration, data quality, and DataOps. Use when designing data architectures, building data pipelines, optimizing data workflows, implementing data governance, or troubleshooting data issues.

native-data-fetching

1864
from LeoYeAI/openclaw-master-skills

Use when implementing or debugging ANY network request, API call, or data fetching. Covers fetch API, React Query, SWR, error handling, caching, offline support, and Expo Router data loaders (useLoaderData).

eastmoney_financial_data

1864
from LeoYeAI/openclaw-master-skills

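This skill is built on Eastmoney's authoritative database and its latest underlying market data. It supports natural-language queries for market data (real-time quotes, main capital flows, valuations, and so on for stocks, industries, sectors, indices, funds, and bonds), financial data (basic information on listed companies, financial indicators, executive information, main business, and so on), and relationship and operations data (affiliations and business operations data). It keeps the model from answering financial-data questions from outdated knowledge and provides authoritative, timely financial data.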

database-schema-designer

1864
from LeoYeAI/openclaw-master-skills

Database Schema Designer

database-designer

1864
from LeoYeAI/openclaw-master-skills

Database Designer - POWERFUL Tier Skill

database-admin

1864
from LeoYeAI/openclaw-master-skills

Comprehensive database administration, schema management, data operations, and optimization. Use when Codex needs to: (1) Create or modify database tables with proper indexing, (2) Perform bulk data insertions with type safety and constraint handling, (3) Execute complex queries with JOINs, aggregations, and subqueries, (4) Optimize query performance through indexing and execution plan analysis, (5) Manage database backups, restores, and migrations, (6) Handle special data types (BIGINT, UUID, JSONB, enums), (7) Implement transactional safety with ACID compliance, or (8) Debug and resolve database errors including constraint violations, type mismatches, and foreign key issues.

data-analyst

1864
from LeoYeAI/openclaw-master-skills

Data visualization, report generation, SQL queries, and spreadsheet automation. Transform your AI agent into a data-savvy analyst that turns raw data into actionable insights.