ab-testing-ecommerce

Run controlled experiments on product pages, checkout flows, and pricing to find what converts best using statistical significance testing

by finsilabs

Best use case

ab-testing-ecommerce is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using ab-testing-ecommerce should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/ab-testing-ecommerce/SKILL.md --create-dirs "https://raw.githubusercontent.com/finsilabs/awesome-ecommerce-skills/main/skills/data-analytics/ab-testing-ecommerce/SKILL.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/ab-testing-ecommerce/SKILL.md inside your project
Restart your AI agent — it will auto-discover the skill

How ab-testing-ecommerce Compares

Feature / Agent	ab-testing-ecommerce	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

What does this skill do?

Run controlled experiments on product pages, checkout flows, and pricing to find what converts best using statistical significance testing

Where can I find the source code?

The source code is on GitHub in the finsilabs/awesome-ecommerce-skills repository, under skills/data-analytics/ab-testing-ecommerce.

SKILL.md Source

# A/B Testing for E-commerce

## Overview

A/B testing (split testing) runs controlled experiments where a random subset of visitors sees a variant while the rest see the control. Statistical analysis then determines whether any difference is real or due to chance. Good testing disciplines — calculating required sample size before starting, running tests for at least two full weeks, and never stopping early — separate genuine insights from noise.

This skill guides you through running A/B tests on your specific platform, choosing the right tools, and interpreting results correctly.

## When to Use This Skill

- When making product page, checkout, or pricing changes and wanting data-driven validation
- When migrating from a client-side A/B testing tool to server-side assignment for accuracy
- When needing statistical power calculations before starting an experiment
- When analyzing experiment results and determining when to ship or kill a variant
- When running a pricing test and needing to ensure consistent pricing per customer
- When wanting to understand what sample size is needed before a test is meaningful

## Core Instructions

### Step 1: Determine your platform and choose the right testing tool

| Platform | Recommended Tool | Why |
|----------|-----------------|-----|
| **Shopify** | **Convert.com** or **Intelligems** | Intelligems is built specifically for Shopify and supports pricing tests with sticky assignment; Convert.com integrates through a script in the Shopify theme. Google Optimize, formerly the free option, was discontinued in September 2023 |
| **Shopify** (pricing tests) | **Intelligems** | Designed for price testing on Shopify, with prices assigned server-side so they do not flicker |
| **WooCommerce** | **Nelio A/B Testing** plugin | Nelio integrates natively with WordPress/WooCommerce and tracks WooCommerce conversion events automatically |
| **BigCommerce** | **Convert.com** or **VWO** (via script injection) | Both integrate via the BigCommerce storefront script manager |
| **Custom / Headless** | **LaunchDarkly** (feature flags + experiments) or build with **GrowthBook** (open source) | Server-side assignment with no flickering; GrowthBook is free and self-hostable |

### Step 2: Calculate required sample size before launching

Never launch a test without knowing how many visitors each variant needs. Running a test without a pre-determined stopping rule leads to peeking and false positives.

Use the free calculator at [https://www.evanmiller.org/ab-testing/sample-size.html](https://www.evanmiller.org/ab-testing/sample-size.html) or follow this guide:

- **Baseline conversion rate:** Pull your current CVR from your platform analytics (last 30 days)
- **Minimum detectable effect:** The smallest lift you care about detecting (typically 0.3–1 percentage point)
- **Statistical power:** 80% is standard
- **Significance level:** 95% confidence (alpha = 0.05)

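**Example:** A Shopify store with a 2.5% CVR that wants to detect a 0.3pp lift (2.5% → 2.8%) needs approximately 45,000 sessions per variant at 80% power and alpha = 0.05 (two-sided test, normal approximation). At 500 sessions/day per variant, that is about 90 days. If that is longer than you can afford to run, test a bolder change with a larger minimum detectable effect rather than stopping early.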

Write down the required sample size before the test starts. This is your mandatory stopping rule.
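
The calculation above can be sketched in a few lines. This is a minimal illustration rather than the calculator's exact method: it assumes a two-sided test on two proportions with the normal approximation, hardcodes the z-scores for alpha = 0.05 and 80% power, and rounds the duration up to whole weeks with a 14-day floor. The function names and the example inputs are hypothetical.

```typescript
// z-scores for the defaults named above (assumption: two-sided test).
const Z_ALPHA_TWO_SIDED_05 = 1.959964; // alpha = 0.05
const Z_POWER_80 = 0.841621; // power = 0.80

/** Sessions needed per variant to detect an absolute lift in conversion rate. */
function sampleSizePerVariant(baselineRate: number, absoluteLift: number): number {
  if (baselineRate <= 0 || baselineRate >= 1) {
    throw new RangeError("baselineRate must be between 0 and 1");
  }
  if (absoluteLift <= 0 || baselineRate + absoluteLift >= 1) {
    throw new RangeError("absoluteLift is out of range");
  }
  const p1 = baselineRate;
  const p2 = baselineRate + absoluteLift;
  // Sum of the binomial variances of the two arms.
  const variance = p1 * (1 - p1) + p2 * (1 - p2);
  const z = Z_ALPHA_TWO_SIDED_05 + Z_POWER_80;
  return Math.ceil((z * z * variance) / (absoluteLift * absoluteLift));
}

/** Days to run: enough to reach the sample, rounded up to whole weeks, never under 14. */
function testDurationDays(perVariant: number, dailySessionsPerVariant: number): number {
  if (dailySessionsPerVariant <= 0) {
    throw new RangeError("dailySessionsPerVariant must be positive");
  }
  const rawDays = Math.ceil(perVariant / dailySessionsPerVariant);
  const wholeWeeks = Math.ceil(rawDays / 7) * 7;
  return Math.max(14, wholeWeeks);
}

// Hypothetical inputs: 4% baseline CVR, 1pp lift, 400 sessions/day per variant.
const needed = sampleSizePerVariant(0.04, 0.01);
const days = testDurationDays(needed, 400);
console.log({ needed, days });
```

With these hypothetical inputs the sketch asks for roughly 6,700 sessions per variant and a 21-day run. Other calculators may use slightly different formulas, so expect small differences in the result.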

### Step 3: Set up the experiment on your platform

---

#### Shopify

**Option A: Theme-based tests with Convert.com**

1. Install Convert.com and add the tracking script via **Online Store → Themes → Edit code → theme.liquid**
2. In Convert.com, go to **Experiences → Create Experience → A/B Test**
3. Use the visual editor to create your variant (change button color, headline, layout)
4. Set goals: **Add to Cart** or **Purchase** (Convert tracks Shopify purchase events automatically)
5. Set traffic allocation (50/50 for most tests)
6. Set the **minimum sample size** you calculated as the stopping condition

**Option B: Pricing tests with Intelligems**

1. Install **Intelligems** from the Shopify App Store
2. Go to **Intelligems → Price Tests → New Test**
3. Select the product(s) to test and set variant prices
4. Intelligems handles sticky assignment server-side — the same customer always sees the same price
5. Set the test duration so that each price group reaches your pre-calculated sample size
6. Review results in Intelligems' dashboard: it shows revenue per visitor (not just CVR) as the primary metric

**For Shopify checkout tests (Shopify Plus only):**
- Use **Checkout Extensibility** or **Shopify Functions** to create checkout variants
- Shopify's built-in A/B testing via **Checkout profiles** is available on Plus

---

#### WooCommerce

**Using Nelio A/B Testing (recommended)**

1. Install **Nelio A/B Testing** from the WordPress plugin directory
2. Go to **Nelio A/B Testing → Add New Test**
3. Choose the test type:
   - **Page Test:** Test different landing page or product page variants
   - **WooCommerce Test:** Test product pricing, descriptions, or images
   - **Headline Test:** Test page titles or CTAs
4. Set your goal to **WooCommerce Order** (conversion event)
5. Nelio tracks statistical significance in real time — do not stop early just because significance is reached; wait for your pre-calculated sample size
6. View results at **Nelio A/B Testing → Results**

**Note on Google Optimize:** Google discontinued Optimize on September 30, 2023, so it is no longer available as a free alternative. If Nelio does not fit your setup, load a script-based tool such as Convert.com or VWO in the site `<head>` and set a purchase-based conversion goal in that tool.

---

#### BigCommerce

1. Go to **Storefront → Script Manager → Create a Script**
2. Add your A/B testing tool script (Convert.com, VWO, or Optimizely) with placement **Head** and **All pages**
3. In your testing tool, create an experiment targeting your BigCommerce product or category page URL
4. Set the conversion goal to track the order confirmation page URL (`/order-confirmation`)
5. BigCommerce also has built-in **Multivariate Testing** under **Marketing → Banner Manager** for banner-level tests (limited to visual banner content)

---

#### Custom / Headless

For headless storefronts, use server-side assignment to avoid flickering and to support pricing tests:

**Using GrowthBook (open source, recommended)**

1. Install GrowthBook: `npm install @growthbook/growthbook`
2. Initialize on the server side with your user ID for sticky assignment:

```typescript
import { GrowthBook } from "@growthbook/growthbook";

const gb = new GrowthBook({
  apiHost: "https://cdn.growthbook.io",
  clientKey: process.env.GROWTHBOOK_CLIENT_KEY,
  attributes: {
    id: userId, // stable user ID for consistent assignment
    loggedIn: !!customerId,
  },
});

await gb.loadFeatures();

// Assign variant — deterministic for the same userId
const checkoutButtonVariant = gb.getFeatureValue("checkout-button-color", "blue");
```

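3. Track exposures and conversions in your own analytics pipeline. GrowthBook reads experiment data from the data source you connect to it rather than receiving events directly; `analytics` below stands for your event-tracking client: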

```typescript
gb.setTrackingCallback((experiment, result) => {
  analytics.track("Experiment Viewed", {
    experimentId: experiment.key,
    variationId: result.key,
  });
});

// On order completion:
analytics.track("Purchase", { revenue: order.total });
```

4. View statistical results in the GrowthBook UI — it runs Bayesian or frequentist significance tests on your data
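
To show what "deterministic for the same userId" means in practice, here is a minimal sketch of hash-based bucketing. It is an illustration of the general technique, not GrowthBook's internal algorithm: `hashToUint32` and `assignVariant` are hypothetical names, and the hash (FNV-1a over UTF-16 code units followed by the MurmurHash3 finalizer) is one reasonable choice among many.

```typescript
/** FNV-1a over UTF-16 code units, then the MurmurHash3 finalizer to mix all bits. */
function hashToUint32(input: string): number {
  let hash = 0x811c9dc5; // FNV-1a 32-bit offset basis
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193); // FNV-1a 32-bit prime
  }
  // fmix32 finalizer from MurmurHash3
  hash ^= hash >>> 16;
  hash = Math.imul(hash, 0x85ebca6b);
  hash ^= hash >>> 13;
  hash = Math.imul(hash, 0xc2b2ae35);
  hash ^= hash >>> 16;
  return hash >>> 0; // force unsigned 32-bit
}

/** Pick a variant for a user; the same inputs always return the same variant. */
function assignVariant(
  userId: string,
  experimentKey: string,
  variants: readonly string[],
): string {
  if (variants.length === 0) {
    throw new Error("variants must not be empty");
  }
  // Salting with the experiment key gives each experiment its own split,
  // so a user is not locked into the same arm everywhere.
  const bucket = hashToUint32(`${experimentKey}:${userId}`) / 2 ** 32; // in [0, 1)
  return variants[Math.floor(bucket * variants.length)];
}

// Hypothetical usage: an equal split between two arms.
const arm = assignVariant("user-123", "checkout-button-color", ["blue", "green"]);
console.log(arm);
```

Because the assignment depends only on the user ID and the experiment key, no assignment table is needed, and the same customer gets the same variant on every request and every server.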

### Step 4: Interpret results correctly

When reviewing results:

1. **Wait for the pre-calculated sample size**: do not stop because it "looks significant"
2. **Check revenue per visitor, not just CVR**: a checkout test might increase CVR but decrease AOV; measure both
3. **Run for at least 2 full weeks**: shopping behavior varies by day of week, so a test that covers partial weeks, or a single week containing one unusual day, gives distorted results
4. **Look at guardrail metrics**: even if your primary metric improved, check return rates and customer service ticket volume

| Metric | What to Check |
|--------|--------------|
| Primary | Revenue per visitor (not CVR alone) |
| Guardrail | Return rate (variant should not increase returns) |
| Guardrail | Cart abandonment rate |
| Confidence | p < 0.05 AND minimum sample size reached |
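
The confidence row combines two conditions, and both can be checked mechanically. The sketch below is a simplified illustration: it assumes a two-sided, pooled two-proportion z-test on conversion counts and approximates the normal CDF with the Abramowitz-Stegun formula 7.1.26. The names `VariantStats`, `twoProportionPValue`, and `isConclusive` are hypothetical. Revenue per visitor is a continuous metric and needs a different test, such as a t-test or bootstrap, which is not shown here.

```typescript
interface VariantStats {
  visitors: number;
  conversions: number;
}

/** Abramowitz-Stegun 7.1.26 approximation of erf (absolute error around 1.5e-7). */
function erf(x: number): number {
  const sign = x < 0 ? -1 : 1;
  const ax = Math.abs(x);
  const t = 1 / (1 + 0.3275911 * ax);
  const poly =
    ((((1.061405429 * t - 1.453152027) * t + 1.421413741) * t - 0.284496736) * t +
      0.254829592) * t;
  return sign * (1 - poly * Math.exp(-ax * ax));
}

function normalCdf(z: number): number {
  return 0.5 * (1 + erf(z / Math.SQRT2));
}

/** Two-sided p-value for the difference in conversion rate (pooled z-test). */
function twoProportionPValue(control: VariantStats, variant: VariantStats): number {
  const p1 = control.conversions / control.visitors;
  const p2 = variant.conversions / variant.visitors;
  const pooled =
    (control.conversions + variant.conversions) / (control.visitors + variant.visitors);
  const se = Math.sqrt(pooled * (1 - pooled) * (1 / control.visitors + 1 / variant.visitors));
  if (se === 0) return 1; // no conversions, or all converted: nothing to compare
  const z = (p2 - p1) / se;
  return 2 * (1 - normalCdf(Math.abs(z)));
}

/** True only when BOTH the sample-size rule and the significance threshold are met. */
function isConclusive(
  control: VariantStats,
  variant: VariantStats,
  requiredPerVariant: number,
  alpha = 0.05,
): boolean {
  const enoughData =
    control.visitors >= requiredPerVariant && variant.visitors >= requiredPerVariant;
  return enoughData && twoProportionPValue(control, variant) < alpha;
}

// Hypothetical counts: 250/10,000 conversions in control vs 320/10,000 in the variant.
const controlStats: VariantStats = { visitors: 10000, conversions: 250 };
const variantStats: VariantStats = { visitors: 10000, conversions: 320 };
console.log(twoProportionPValue(controlStats, variantStats).toFixed(4));
console.log(isConclusive(controlStats, variantStats, 45000));
```

With these hypothetical counts the p-value is about 0.003, yet `isConclusive` returns `false` when the required sample is 45,000 per variant. That is the point of the two-part rule: a low p-value before the planned sample size is reached is not a reason to stop.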

## Best Practices

- **Calculate sample size before starting** — running until it "looks significant" is p-hacking; use the pre-calculated size as your stopping rule
- **Use server-side assignment for pricing tests** — client-side tools create flickering and can show different prices on page reload, which is a legal and UX risk
- **Never run more than 3–4 experiments on the same page simultaneously** — interaction effects between experiments contaminate all results
- **Exclude internal team traffic** — add your office IP to an exclusion list in your testing tool to prevent internal browsing from polluting results
- **Document the hypothesis before starting** — write down what you expect to happen and why; post-hoc hypothesis generation leads to confirmation bias
- **Run experiments for at least 2 full business weeks** — account for day-of-week and weekend shopping pattern differences

## Common Pitfalls

| Problem | Solution |
|---------|----------|
| Test ends early because it "looks significant" — then the lift disappears | Use pre-calculated sample size as a mandatory stopping rule; configure your testing tool to lock results until sample size is reached |
| Same user sees different variants on different sessions | Use server-side assignment keyed on a stable user ID (not session ID); Intelligems and GrowthBook handle this correctly by default |
| Checkout test shows lift in CVR but drop in AOV | Always measure revenue per visitor as your primary metric; CVR and AOV can move in opposite directions |
| Price flickering on Shopify pricing tests | Use Intelligems instead of client-side tools — it assigns prices server-side before the page renders |
| Novelty effect inflates variant results in the first week | Report results with and without the first 3 days of data; a large week-1 spike that fades is usually novelty |
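
The novelty check in the last row can be sketched as a comparison of relative lift in revenue per visitor with and without the first days of the test. This is a minimal illustration that assumes you can export per-day visitor and revenue totals for each arm; `DailyResult` and `liftWithAndWithoutEarlyDays` are hypothetical names, and the numbers are made up.

```typescript
interface DailyResult {
  day: number; // 1-based number of days since launch
  visitors: number;
  revenue: number;
}

function revenuePerVisitor(rows: readonly DailyResult[]): number {
  const visitors = rows.reduce((sum, r) => sum + r.visitors, 0);
  const revenue = rows.reduce((sum, r) => sum + r.revenue, 0);
  return visitors === 0 ? 0 : revenue / visitors;
}

/** Relative lift of variant over control; NaN when control has no revenue. */
function relativeLift(control: readonly DailyResult[], variant: readonly DailyResult[]): number {
  const base = revenuePerVisitor(control);
  return base === 0 ? NaN : revenuePerVisitor(variant) / base - 1;
}

/** Lift over the whole test and lift with the first `skipDays` days dropped. */
function liftWithAndWithoutEarlyDays(
  control: readonly DailyResult[],
  variant: readonly DailyResult[],
  skipDays = 3,
): { allDays: number; excludingEarlyDays: number } {
  const late = (rows: readonly DailyResult[]) => rows.filter((r) => r.day > skipDays);
  return {
    allDays: relativeLift(control, variant),
    excludingEarlyDays: relativeLift(late(control), late(variant)),
  };
}

// Made-up data: the variant only outperforms during the first three days.
const controlDays: DailyResult[] = [1, 2, 3, 4, 5, 6].map((day) => ({
  day,
  visitors: 100,
  revenue: 200,
}));
const variantDays: DailyResult[] = [1, 2, 3, 4, 5, 6].map((day) => ({
  day,
  visitors: 100,
  revenue: day <= 3 ? 300 : 200,
}));
console.log(liftWithAndWithoutEarlyDays(controlDays, variantDays));
```

In this made-up data the lift is 25% over all six days and 0% once the first three days are excluded, which is the pattern to treat as novelty rather than a durable improvement.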

## Related Skills

- @conversion-rate-optimization
- @product-analytics
- @customer-analytics
- @sales-reporting-dashboard
- @attribution-modeling
