A/B Test Design
Statistical experiment design and analysis capabilities for product experimentation
Best use case
A/B Test Design is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Teams using A/B Test Design should expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in `.claude/skills/ab-test-design/SKILL.md` inside your project
- Restart your AI agent; it will auto-discover the skill
How A/B Test Design Compares
| Feature / Agent | A/B Test Design | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Statistical experiment design and analysis capabilities for product experimentation
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# A/B Test Design Skill
## Overview
Specialized skill for statistical experiment design and analysis. It enables product teams to design rigorous experiments, calculate sample sizes, and interpret results with statistical confidence.
## Capabilities
### Experiment Design
- Calculate required sample sizes for experiments
- Design experiment variants and hypotheses
- Define success metrics and guardrail metrics
- Create experiment documentation templates
- Design multi-variant tests (A/B/n)
- Plan sequential and Bayesian experiments
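
As a rough illustration of the sample-size step, the sketch below applies the standard normal-approximation formula for comparing two proportions. It is not the skill's own implementation. The function names, the default power of 0.8, the two-sided test, the equal split between variants, and the reading of `mde` as a relative lift over `baseline` are all assumptions made for this example.

```javascript
// Sketch only: names and defaults below are illustrative assumptions,
// not part of the skill's documented interface.

// Abramowitz-Stegun 7.1.26 approximation of erf (max abs error about 1.5e-7).
function erf(x) {
  const sign = x < 0 ? -1 : 1;
  const ax = Math.abs(x);
  const t = 1 / (1 + 0.3275911 * ax);
  const poly =
    ((((1.061405429 * t - 1.453152027) * t + 1.421413741) * t - 0.284496736) * t +
      0.254829592) * t;
  return sign * (1 - poly * Math.exp(-ax * ax));
}

// Standard normal cumulative distribution function.
function normalCdf(z) {
  return 0.5 * (1 + erf(z / Math.SQRT2));
}

// Standard normal quantile, found by bisection on normalCdf.
function normalQuantile(p) {
  let lo = -10;
  let hi = 10;
  for (let i = 0; i < 100; i++) {
    const mid = (lo + hi) / 2;
    if (normalCdf(mid) < p) lo = mid;
    else hi = mid;
  }
  return (lo + hi) / 2;
}

// Visitors needed per variant for a two-sided, two-proportion z-test.
// Assumption: mde is a relative lift (0.10 means baseline * 1.10).
function sampleSizePerVariant({ baseline, mde, confidenceLevel = 0.95, power = 0.8 }) {
  const p1 = baseline;
  const p2 = baseline * (1 + mde);
  const zAlpha = normalQuantile(1 - (1 - confidenceLevel) / 2);
  const zBeta = normalQuantile(power);
  const pBar = (p1 + p2) / 2;
  const numerator =
    zAlpha * Math.sqrt(2 * pBar * (1 - pBar)) +
    zBeta * Math.sqrt(p1 * (1 - p1) + p2 * (1 - p2));
  return Math.ceil(numerator ** 2 / (p2 - p1) ** 2);
}

const perVariant = sampleSizePerVariant({ baseline: 0.05, mde: 0.1 });
console.log(`Visitors needed per variant: ${perVariant}`);
```

With a 5% baseline and a 10% relative lift, this lands in the low tens of thousands of visitors per variant, which shows why small effects on low-baseline metrics need long-running tests. If `mde` is meant as an absolute difference, replace the `p2` line with `baseline + mde`.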
### Statistical Analysis
- Validate statistical significance of results
- Calculate practical significance and effect sizes
- Detect interaction effects and segment-level differences
- Perform power analysis
- Calculate confidence intervals
- Handle multiple comparison corrections
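
The analysis steps listed above can be sketched for a conversion-rate metric as follows. This is a minimal example, assuming a pooled two-proportion z-test, a Wald confidence interval for the difference, and a Bonferroni correction; the skill may use other methods. All function and field names are hypothetical.

```javascript
// Sketch only: a two-proportion z-test with a confidence interval and a
// Bonferroni correction. Names are illustrative, not the skill's API.

// Abramowitz-Stegun 7.1.26 approximation of erf (max abs error about 1.5e-7).
function erf(x) {
  const sign = x < 0 ? -1 : 1;
  const ax = Math.abs(x);
  const t = 1 / (1 + 0.3275911 * ax);
  const poly =
    ((((1.061405429 * t - 1.453152027) * t + 1.421413741) * t - 0.284496736) * t +
      0.254829592) * t;
  return sign * (1 - poly * Math.exp(-ax * ax));
}

function normalCdf(z) {
  return 0.5 * (1 + erf(z / Math.SQRT2));
}

// zCrit defaults to the two-sided 95% critical value of the standard normal.
function twoProportionZTest(
  { conversionsA, visitorsA, conversionsB, visitorsB },
  zCrit = 1.959964
) {
  const pA = conversionsA / visitorsA;
  const pB = conversionsB / visitorsB;
  const diff = pB - pA;

  // Pooled standard error for the hypothesis test (assumes no difference under H0).
  const pooled = (conversionsA + conversionsB) / (visitorsA + visitorsB);
  const sePooled = Math.sqrt(pooled * (1 - pooled) * (1 / visitorsA + 1 / visitorsB));
  const z = diff / sePooled;
  const pValue = 2 * (1 - normalCdf(Math.abs(z)));

  // Unpooled standard error for the confidence interval on the difference.
  const seDiff = Math.sqrt((pA * (1 - pA)) / visitorsA + (pB * (1 - pB)) / visitorsB);

  return {
    diff,                    // absolute effect size
    relativeLift: diff / pA, // effect size relative to control
    z,
    pValue,
    ci: [diff - zCrit * seDiff, diff + zCrit * seDiff],
  };
}

// Bonferroni correction: multiply each p-value by the number of comparisons.
function bonferroni(pValues) {
  const m = pValues.length;
  return pValues.map((p) => Math.min(1, p * m));
}

const result = twoProportionZTest({
  conversionsA: 500, visitorsA: 10000, // control (made-up counts)
  conversionsB: 580, visitorsB: 10000, // treatment (made-up counts)
});
console.log(result.pValue < 0.05 ? "statistically significant" : "not significant");
console.log(bonferroni([result.pValue, 0.04, 0.5]));
```

Bonferroni is the simplest correction and also the most conservative; with many variants or metrics, a step-down method such as Holm keeps the same error guarantee with more power.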
### Decision Support
- Recommend ship/iterate/kill decisions
- Identify segment-specific impacts
- Assess long-term vs short-term effects
- Generate experiment reports
- Track experiment velocity metrics
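
One way to make a ship/iterate/kill recommendation reproducible is to encode the decision criteria as a small rule. The policy below is purely hypothetical: the thresholds, field names, and the choice to treat any guardrail regression as a "kill" are assumptions for illustration, not the skill's documented behavior.

```javascript
// Sketch only: a hypothetical decision policy, not the skill's actual logic.
// ciLower / ciUpper: confidence interval bounds for the primary metric's effect.
// minPracticalEffect: smallest effect worth shipping (practical significance).
// guardrailRegressions: names of guardrail metrics that regressed significantly.
function recommendDecision({ ciLower, ciUpper, minPracticalEffect, guardrailRegressions = [] }) {
  const pointEstimate = (ciLower + ciUpper) / 2;

  if (guardrailRegressions.length > 0) {
    return {
      decision: "kill",
      reason: `guardrail regression: ${guardrailRegressions.join(", ")}`,
    };
  }
  if (ciUpper < 0) {
    return { decision: "kill", reason: "effect is significantly negative" };
  }
  if (ciLower > 0 && pointEstimate >= minPracticalEffect) {
    return { decision: "ship", reason: "significant and practically meaningful" };
  }
  // Inconclusive, or significant but too small to matter.
  return { decision: "iterate", reason: "no clear, practically meaningful win" };
}

console.log(recommendDecision({ ciLower: 0.002, ciUpper: 0.014, minPracticalEffect: 0.005 }));
```

Writing the rule down before the experiment starts helps avoid adjusting the criteria after seeing the results.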
## Target Processes
This skill integrates with the following processes:
- `product-market-fit.js` - Validation experiments for PMF hypotheses
- `conversion-funnel-analysis.js` - Funnel optimization experiments
- `beta-program.js` - A/B testing during beta phases
## Input Schema
```json
{
  "type": "object",
  "properties": {
    "experimentType": {
      "type": "string",
      "enum": ["ab", "multivariate", "sequential", "bandit"],
      "description": "Type of experiment to design"
    },
    "hypothesis": {
      "type": "string",
      "description": "Hypothesis to test"
    },
    "primaryMetric": {
      "type": "object",
      "properties": {
        "name": { "type": "string" },
        "baseline": { "type": "number" },
        "mde": { "type": "number", "description": "Minimum detectable effect" }
      }
    },
    "guardrailMetrics": {
      "type": "array",
      "items": { "type": "string" },
      "description": "Metrics that should not regress"
    },
    "trafficAllocation": {
      "type": "number",
      "description": "Percentage of traffic for experiment"
    },
    "confidenceLevel": {
      "type": "number",
      "default": 0.95,
      "description": "Statistical confidence level"
    }
  },
  "required": ["experimentType", "hypothesis", "primaryMetric"]
}
```
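
Before invoking the skill, a caller might check an input object against the schema above. The following is a hand-rolled sketch covering only the required fields, the `experimentType` enum, and the `confidenceLevel` default; the `validateInput` function is hypothetical, and a full JSON Schema validator would be more thorough.

```javascript
// Sketch only: a minimal check against the input schema, not a full
// JSON Schema validator. validateInput is a hypothetical helper.
const EXPERIMENT_TYPES = ["ab", "multivariate", "sequential", "bandit"];
const REQUIRED_FIELDS = ["experimentType", "hypothesis", "primaryMetric"];

function validateInput(input) {
  const errors = [];

  for (const field of REQUIRED_FIELDS) {
    if (input[field] === undefined) errors.push(`missing required field: ${field}`);
  }
  if (input.experimentType !== undefined && !EXPERIMENT_TYPES.includes(input.experimentType)) {
    errors.push(`experimentType must be one of: ${EXPERIMENT_TYPES.join(", ")}`);
  }
  if (input.hypothesis !== undefined && typeof input.hypothesis !== "string") {
    errors.push("hypothesis must be a string");
  }
  if (
    input.primaryMetric !== undefined &&
    (typeof input.primaryMetric !== "object" || input.primaryMetric === null)
  ) {
    errors.push("primaryMetric must be an object");
  }

  // Apply the schema's documented default for confidenceLevel.
  const value = { ...input, confidenceLevel: input.confidenceLevel ?? 0.95 };
  return { valid: errors.length === 0, errors, value };
}

const check = validateInput({
  experimentType: "ab",
  hypothesis: "Adding social proof to pricing page increases conversion by 10%",
  primaryMetric: { name: "pricing_page_conversion", baseline: 0.05, mde: 0.1 },
});
console.log(check.valid, check.value.confidenceLevel);
```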
## Output Schema
```json
{
  "type": "object",
  "properties": {
    "experimentPlan": {
      "type": "object",
      "properties": {
        "name": { "type": "string" },
        "hypothesis": { "type": "string" },
        "variants": { "type": "array", "items": { "type": "object" } },
        "sampleSize": { "type": "number" },
        "duration": { "type": "string" },
        "metrics": { "type": "object" }
      }
    },
    "powerAnalysis": {
      "type": "object",
      "properties": {
        "requiredSampleSize": { "type": "number" },
        "estimatedDuration": { "type": "string" },
        "power": { "type": "number" }
      }
    },
    "implementation": {
      "type": "object",
      "properties": {
        "trackingEvents": { "type": "array", "items": { "type": "string" } },
        "segmentation": { "type": "array", "items": { "type": "string" } },
        "rolloutPlan": { "type": "string" }
      }
    },
    "analysisFramework": {
      "type": "object",
      "properties": {
        "primaryAnalysis": { "type": "string" },
        "secondaryAnalyses": { "type": "array", "items": { "type": "string" } },
        "decisionCriteria": { "type": "object" }
      }
    }
  }
}
```
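
The schema does not say how `estimatedDuration` is derived. A common back-of-the-envelope approach divides the total required sample by the daily traffic entering the experiment, as sketched below. The `dailyVisitors` input is hypothetical (it is not part of the input schema), and the example assumes `trafficAllocation` is the percentage of all traffic enrolled in the experiment, split evenly across variants.

```javascript
// Sketch only: a rough duration estimate. dailyVisitors is a hypothetical
// input, and trafficAllocation is assumed to be the percentage of all
// traffic enrolled in the experiment (shared evenly by the variants).
function estimateDuration({
  sampleSizePerVariant,
  variantCount = 2,
  dailyVisitors,
  trafficAllocation = 100,
}) {
  const enrolledPerDay = dailyVisitors * (trafficAllocation / 100);
  if (!(enrolledPerDay > 0)) {
    throw new RangeError("dailyVisitors and trafficAllocation must be positive");
  }
  const days = Math.ceil((sampleSizePerVariant * variantCount) / enrolledPerDay);
  // Rounding up to whole weeks helps average out day-of-week effects.
  const fullWeeks = Math.ceil(days / 7);
  return { days, fullWeeks };
}

const estimate = estimateDuration({
  sampleSizePerVariant: 31234, // illustrative figure, not a skill output
  dailyVisitors: 10000,        // illustrative figure
  trafficAllocation: 50,
});
console.log(`${estimate.days} days, run for ${estimate.fullWeeks} full week(s)`);
```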
## Usage Example
```javascript
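const experimentDesign = await executeSkill('ab-test-design', {
  experimentType: 'ab',
  hypothesis: 'Adding social proof to pricing page increases conversion by 10%',
  primaryMetric: {
    name: 'pricing_page_conversion',
    baseline: 0.05, // current conversion rate (5%)
    mde: 0.10       // minimum detectable effect; the hypothesis suggests a 10% relative lift
  },
  guardrailMetrics: ['revenue_per_visitor', 'bounce_rate'], // metrics that should not regress
  trafficAllocation: 50, // percentage of traffic for the experiment
  confidenceLevel: 0.95
});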
```
## Dependencies
- Statistical libraries for power analysis
- Experimentation platform integrations (Optimizely, LaunchDarkly, etc.)
Related Skills
vitest
Vitest configuration, mocking, coverage, snapshot testing, and performance.
rest-api-design
RESTful API design principles, versioning, pagination, HATEOAS, and documentation.
react-testing-library
React Testing Library patterns, queries, user events, and accessibility testing.
design-tokens
Design token management, generation, and multi-platform support.
design-token-transformer
Transform design tokens across multiple formats and platforms. Parse W3C design token format, transform to CSS/SCSS/JS/iOS/Android, handle token aliases and references, and generate documentation.
design-system-validator
Validate design system compliance in code and detect token usage violations
load-test-generator
Generate load test scripts for k6, Locust, and Gatling from OpenAPI specs
cloud-security-testing
Multi-cloud security assessment and penetration testing capabilities. Execute Prowler/ScoutSuite assessments, analyze IAM policies, identify cloud misconfigurations, test permissions, and enumerate cloud resources across AWS/GCP/Azure.
scope-permission-designer
Design and implement scoped permission models
rate-limiter-designer
Design and implement rate limiting strategies
protobuf-grpc-designer
Protocol Buffers and gRPC service definition with backward compatibility checks
middleware-chain-designer
Design middleware and interceptor chains for SDK extensibility