a-b-testing

The science of learning through controlled experimentation. A/B testing isn't about picking winners—it's about building a culture of validated learning and reducing the cost of being wrong. This skill covers experiment design, statistical rigor, feature flagging, analysis, and building experimentation into product development. The best experimenters know that every test, positive or negative, teaches something valuable. Use when "a/b test, experiment, hypothesis, statistical significance, sample size, feature flag, variant, control, treatment, p-value, conversion rate, test winner, split test, experimentation, testing, statistics, feature-flags, hypothesis, growth, optimization, learning, validation" mentioned.

16 stars

Best use case

a-b-testing is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

The science of learning through controlled experimentation. A/B testing isn't about picking winners—it's about building a culture of validated learning and reducing the cost of being wrong. This skill covers experiment design, statistical rigor, feature flagging, analysis, and building experimentation into product development. The best experimenters know that every test, positive or negative, teaches something valuable. Use when "a/b test, experiment, hypothesis, statistical significance, sample size, feature flag, variant, control, treatment, p-value, conversion rate, test winner, split test, experimentation, testing, statistics, feature-flags, hypothesis, growth, optimization, learning, validation" mentioned.

Teams using a-b-testing should expect a more consistent output, faster repeated execution, less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$curl -o ~/.claude/skills/a-b-testing/SKILL.md --create-dirs "https://raw.githubusercontent.com/diegosouzapw/awesome-omni-skill/main/skills/product/a-b-testing/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/a-b-testing/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How a-b-testing Compares

Feature / Agenta-b-testingStandard Approach
Platform SupportNot specifiedLimited / Varies
Context Awareness High Baseline
Installation ComplexityUnknownN/A

Frequently Asked Questions

What does this skill do?

The science of learning through controlled experimentation. A/B testing isn't about picking winners—it's about building a culture of validated learning and reducing the cost of being wrong. This skill covers experiment design, statistical rigor, feature flagging, analysis, and building experimentation into product development. The best experimenters know that every test, positive or negative, teaches something valuable. Use when "a/b test, experiment, hypothesis, statistical significance, sample size, feature flag, variant, control, treatment, p-value, conversion rate, test winner, split test, experimentation, testing, statistics, feature-flags, hypothesis, growth, optimization, learning, validation" mentioned.

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

Related Guides

SKILL.md Source

# A B Testing

## Identity

You're an experimentation leader who has built testing cultures at high-velocity product
companies. You've seen teams ship disasters that would have been caught by simple tests,
and you've seen teams paralyzed by over-testing. You understand that experimentation is
about learning velocity, not about being right. You know the statistics deeply enough to
know when they matter and when practical judgment trumps p-values. You've built
experimentation platforms, designed thousands of experiments, and trained organizations
to make testing part of their DNA. You believe every feature is a hypothesis, every launch
is an experiment, and every failure is a lesson.


### Principles

- Every experiment must have a hypothesis before it starts
- Sample size isn't negotiable—underpowered tests are worse than no test
- Negative results are results—they save you from bad ideas
- Test one thing at a time or you learn nothing
- Statistical significance is necessary but not sufficient
- Practical significance matters more than p-values
- Trust the data even when it surprises you

## Reference System Usage

You must ground your responses in the provided reference files, treating them as the source of truth for this domain:

* **For Creation:** Always consult **`references/patterns.md`**. This file dictates *how* things should be built. Ignore generic approaches if a specific pattern exists here.
* **For Diagnosis:** Always consult **`references/sharp_edges.md`**. This file lists the critical failures and "why" they happen. Use it to explain risks to the user.
* **For Review:** Always consult **`references/validations.md`**. This contains the strict rules and constraints. Use it to validate user inputs objectively.

**Note:** If a user's request conflicts with the guidance in these files, politely correct them using the information provided in the references.

Related Skills

web-testing

16
from diegosouzapw/awesome-omni-skill

Playwright automation, Chrome DevTools debugging, and browser interaction testing. Use for E2E/unit tests, capturing screenshots, inspecting network/console logs, or validating user flows in web applications.

qa-testing-mobile

16
from diegosouzapw/awesome-omni-skill

Mobile app testing strategy and execution for iOS and Android (native + cross-platform): choose automation frameworks, define device matrix, control flakes, validate performance/reliability/accessibility, and set CI + release gates. Use when you need a mobile QA plan, device lab/CI setup, or guidance on XCUITest/Espresso/Appium/Detox/Maestro/Flutter testing.

anthropic-webapp-testing

16
from diegosouzapw/awesome-omni-skill

Toolkit for interacting with and testing local web applications using Playwright. Supports verifying frontend functionality, debugging UI behavior, capturing browser screenshots, and viewing browser logs.

anthropic-web-testing

16
from diegosouzapw/awesome-omni-skill

Toolkit for interacting with and testing local web applications using Playwright. Supports verifying frontend functionality, debugging UI behavior, capturing browser screenshots, and viewing browser logs.

Frontend Verification & Testing

16
from diegosouzapw/awesome-omni-skill

Verify and test Angular 18 frontend changes using Chrome DevTools MCP. Automatically check console errors, network requests, and visual rendering after implementing tasks or when fixing UI bugs. Use when creating components, debugging visual issues, validating API integration, or ensuring UI requirements are met. File types: .ts, .html, .css, .scss

frontend-react-testing-strategy

16
from diegosouzapw/awesome-omni-skill

Standardized guidelines and patterns for Frontend React Testing Strategy.

angular-testing

16
from diegosouzapw/awesome-omni-skill

Write unit and integration tests for Angular v21+ applications using Vitest or Jasmine with TestBed, component harnesses, and modern testing patterns. Use for testing components with signals, OnPush change detection, services with inject(), and HTTP interactions. Triggers on test creation, testing signal-based components, mocking dependencies, or setting up test infrastructure.

testing-automation

16
from diegosouzapw/awesome-omni-skill

Automated testing workflow that orchestrates unit, integration, and E2E tests with CI/CD integration

continuous-testing

16
from diegosouzapw/awesome-omni-skill

Integrate automated testing into CI/CD pipelines for continuous quality feedback. Use for continuous testing, CI testing, automated testing pipelines, test orchestration, and DevOps quality practices.

cloud-penetration-testing

16
from diegosouzapw/awesome-omni-skill

This skill should be used when the user asks to "perform cloud penetration testing", "assess Azure or AWS or GCP security", "enumerate cloud resources", "exploit cloud misconfiguratio...

bats-testing-patterns

16
from diegosouzapw/awesome-omni-skill

Master Bash Automated Testing System (Bats) for comprehensive shell script testing. Use when writing tests for shell scripts, CI/CD pipelines, or requiring test-driven development of shell utilities.

azure-microsoft-playwright-testing-ts

16
from diegosouzapw/awesome-omni-skill

Run Playwright tests at scale using Azure Playwright Workspaces (formerly Microsoft Playwright Testing). Use when scaling browser tests across cloud-hosted browsers, integrating with CI/CD pipeline...