ab-testing
Set up and evaluate A/B tests using feature-flag-server. Use when you need to run a multivariate experiment, assign users to variants, track which variant they received, or analyze variant distribution. Triggers include "A/B test", "split test", "experiment", "variant", "multivariate", "rollout percentage", or any task involving exposing different experiences to different users.
Best use case
ab-testing is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Teams using ab-testing should expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in .claude/skills/ab-testing/SKILL.md inside your project
- Restart your AI agent; it will auto-discover the skill
How ab-testing Compares
| Feature / Agent | ab-testing | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
It sets up and evaluates A/B tests using feature-flag-server: running multivariate experiments, assigning users to variants, tracking which variant each user received, and analyzing variant distribution.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# ab-testing
Run A/B tests and multivariate experiments using feature-flag-server segment flags.
## When to use
- Running a two-variant A/B test (control vs treatment)
- Multivariate tests with more than two variants
- Gradual feature rollout with percentage-based targeting
- Segment-targeted experiments (e.g., only beta users see the variant)
## Flag types for experiments
| Type | Use case |
|---|---|
| `percentage` | Random percentage split (e.g., 50% see the feature) |
| `segment` | Target specific user groups with different variants |
For A/B tests, use a `segment` flag with weighted variants. This gives full control over the variant split.
## Creating an A/B test via REST API
```bash
# Step 1: Create the flag
curl -s -X POST http://localhost:7777/api/flags \
  -H "Authorization: Bearer ffs_..." \
  -H "Content-Type: application/json" \
  -d '{
    "key": "button-color-test",
    "name": "Button Color A/B Test",
    "flagType": "segment"
  }'
# Response: { "id": "550e8400-e29b-...", "key": "button-color-test", ... }

# Step 2: Configure variants and weights for production
curl -s -X PATCH http://localhost:7777/api/flags/550e8400-e29b-.../environments/production \
  -H "Authorization: Bearer ffs_..." \
  -H "Content-Type: application/json" \
  -d '{
    "enabled": 1,
    "variants": [
      { "key": "control", "value": "#3B82F6", "weight": 50 },
      { "key": "variant-a", "value": "#10B981", "weight": 30 },
      { "key": "variant-b", "value": "#F59E0B", "weight": 20 }
    ]
  }'
```
Note: variant weights must sum to 100.
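A client-side check before sending the PATCH can catch a bad split early. The sketch below validates that the weights sum to 100 and shows the bucket range each variant would occupy under a simple cumulative-range scheme. The `Variant` and `BucketRange` types and the functions `assertWeightsSumTo100` and `toBucketRanges` are hypothetical names, and the range scheme is an assumption about how a weighted split could be laid out, not the server's documented behavior.

```typescript
interface Variant {
  key: string;
  value: string;
  weight: number;
}

// Half-open range [from, to) of buckets out of 100 (illustrative layout).
interface BucketRange {
  key: string;
  from: number;
  to: number;
}

// Throws if any weight is negative or non-finite, or if the total is not 100.
function assertWeightsSumTo100(variants: Variant[]): void {
  let total = 0;
  for (const { key, weight } of variants) {
    if (!Number.isFinite(weight) || weight < 0) {
      throw new Error(`invalid weight for variant "${key}": ${weight}`);
    }
    total += weight;
  }
  if (Math.abs(total - 100) > 1e-9) {
    throw new Error(`variant weights sum to ${total}, expected 100`);
  }
}

// Lay the variants out end to end so each one owns a slice of [0, 100).
function toBucketRanges(variants: Variant[]): BucketRange[] {
  assertWeightsSumTo100(variants);
  let start = 0;
  return variants.map(({ key, weight }) => {
    const range = { key, from: start, to: start + weight };
    start += weight;
    return range;
  });
}

const ranges = toBucketRanges([
  { key: "control", value: "#3B82F6", weight: 50 },
  { key: "variant-a", value: "#10B981", weight: 30 },
  { key: "variant-b", value: "#F59E0B", weight: 20 },
]);
```

For the 50/30/20 split, this yields control at [0, 50), variant-a at [50, 80), and variant-b at [80, 100). Under such a scheme, changing the weights moves the range boundaries, which would be one reason some users get re-assigned.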
## Evaluating a variant (SDK)
```typescript
import { FlagClient } from '@ffs/js-sdk';

const client = new FlagClient({
  baseUrl: 'http://localhost:7777',
  apiKey: process.env.FFS_API_KEY!,
  environment: 'production',
});

// Assumes this runs inside an async request handler where `req.user` is populated.
const result = await client.getVariant('button-color-test', {
  userId: req.user.id,
  plan: req.user.plan,
  country: req.user.country,
});
// result.value   => "#10B981" (the variant's value)
// result.variant => "variant-a" (the variant key)
// result.reason  => "segment_match"

// Use the value in your UI
const buttonStyle = { backgroundColor: result.value };
```
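A failed evaluation should not break rendering. The following is a minimal sketch of a fallback wrapper around the call shown above. `VariantClient` and `VariantResult` are hypothetical types that mirror only the `getVariant` call and the result fields in the example, and `getVariantOrDefault` and the `"fallback"` reason string are assumptions, not part of the SDK.

```typescript
interface VariantResult {
  value: string;
  variant: string;
  reason: string;
}

// Structural type covering only the call used above (not the SDK's own typings).
interface VariantClient {
  getVariant(flagKey: string, context: Record<string, unknown>): Promise<VariantResult>;
}

// Return the evaluated variant, or the supplied fallback if evaluation throws.
async function getVariantOrDefault(
  client: VariantClient,
  flagKey: string,
  context: Record<string, unknown>,
  fallback: { variant: string; value: string },
): Promise<VariantResult> {
  try {
    return await client.getVariant(flagKey, context);
  } catch {
    // "fallback" is an illustrative reason string, not one the server emits.
    return { ...fallback, reason: "fallback" };
  }
}
```

Passing the control variant as the fallback keeps users on the baseline experience whenever the flag server is unreachable.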
## Checking variant distribution via API
```bash
curl -s http://localhost:7777/api/analytics/flags/550e8400-e29b-.../variants \
  -H "Authorization: Bearer ffs_..."
# Response:
# {
#   "total": 111432,
#   "variants": [
#     { "key": "control", "count": 55716, "pct": 50.0 },
#     { "key": "variant-a", "count": 33430, "pct": 30.0 },
#     { "key": "variant-b", "count": 22286, "pct": 20.0 }
#   ],
#   "periodDays": 7
# }
```
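One way to sanity-check a distribution like this is a chi-square goodness-of-fit test that compares the observed counts with the configured weights (often called a sample ratio mismatch check). The sketch below uses the counts from the example response and the 50/30/20 weights configured earlier. The `chiSquare` function and `VariantCount` type are hypothetical names; 5.991 is the standard critical value for 2 degrees of freedom at the 0.05 level.

```typescript
interface VariantCount {
  key: string;
  count: number;
}

// Pearson chi-square statistic: sum over variants of (observed - expected)^2 / expected.
function chiSquare(counts: VariantCount[], weights: Record<string, number>): number {
  const total = counts.reduce((sum, v) => sum + v.count, 0);
  if (total === 0) throw new Error("no evaluations recorded");
  let statistic = 0;
  for (const { key, count } of counts) {
    const weight = weights[key];
    if (weight === undefined || weight <= 0) {
      throw new Error(`no positive weight configured for "${key}"`);
    }
    const expected = (total * weight) / 100;
    statistic += (count - expected) ** 2 / expected;
  }
  return statistic;
}

const observed: VariantCount[] = [
  { key: "control", count: 55716 },
  { key: "variant-a", count: 33430 },
  { key: "variant-b", count: 22286 },
];
const statistic = chiSquare(observed, { control: 50, "variant-a": 30, "variant-b": 20 });

// Three variants give 2 degrees of freedom; a statistic above 5.991
// suggests the observed split deviates from the configured one.
const mismatch = statistic > 5.991;
```

For the example counts the statistic is far below the threshold, so `mismatch` is `false`. A mismatch in real data usually points to an assignment or logging problem, and is worth investigating before reading experiment results.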
## Percentage rollout (gradual launch)
For simple gradual rollouts without variants, use a `percentage` flag:
```bash
# Create percentage flag, start at 0%
curl -s -X POST http://localhost:7777/api/flags \
  -H "Authorization: Bearer ffs_..." \
  -H "Content-Type: application/json" \
  -d '{ "key": "feature-x-rollout", "name": "Feature X Rollout", "flagType": "percentage" }'

# Enable at 10% in production
curl -s -X PATCH http://localhost:7777/api/flags/<id>/environments/production \
  -H "Authorization: Bearer ffs_..." \
  -H "Content-Type: application/json" \
  -d '{ "enabled": 1, "rolloutPct": 10 }'

# Ramp to 50%
curl -s -X PATCH http://localhost:7777/api/flags/<id>/environments/production \
  -H "Authorization: Bearer ffs_..." \
  -H "Content-Type: application/json" \
  -d '{ "rolloutPct": 50 }'
```
Evaluating a percentage flag returns `true` if the user is within the rollout, `false` otherwise. The assignment is deterministic and stable for each user.
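Stable assignment of this kind is commonly implemented by hashing each user into a fixed bucket. The sketch below is an assumption about how such a scheme could work, not feature-flag-server's documented algorithm; `fnv1a` and `isInRollout` are hypothetical names, and the choice of hash is illustrative.

```typescript
// 32-bit FNV-1a computed over UTF-16 code units (an illustrative hash choice).
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash >>> 0;
}

// A user is in the rollout when their bucket (0 to 99) is below the rollout percentage.
function isInRollout(flagKey: string, userId: string, rolloutPct: number): boolean {
  const bucket = fnv1a(`${flagKey}:${userId}`) % 100;
  return bucket < rolloutPct;
}

const atTen = isInRollout("feature-x-rollout", "user-42", 10);
const atFifty = isInRollout("feature-x-rollout", "user-42", 50);
```

Because the bucket depends only on the flag key and user ID, a scheme like this keeps every user from the 10% cohort in the rollout when the percentage is raised to 50, and only adds new users.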
## Segment-targeted experiments
Target a variant only at specific users by combining a segment with a flag:
1. Create a segment: `POST /api/segments` with rules (e.g., `user.plan eq "enterprise"`)
2. Create a segment flag
3. Add a rule that assigns `variant-a` to users matching the segment
4. Set a default variant (returned for users who do not match any rule)
```json
{
  "rules": [
    {
      "segmentId": "segment-uuid-here",
      "variant": "variant-a"
    }
  ],
  "defaultVariant": "control"
}
```
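How a rule set like this might resolve to a variant can be sketched as first-match evaluation with a default. The evaluation order is an assumption, `resolveVariant` and the `userSegmentIds` parameter are hypothetical, and segment membership is taken as already computed elsewhere.

```typescript
interface SegmentRule {
  segmentId: string;
  variant: string;
}

interface RuleConfig {
  rules: SegmentRule[];
  defaultVariant: string;
}

// Return the variant of the first rule whose segment contains the user,
// or the default variant when no rule matches.
function resolveVariant(config: RuleConfig, userSegmentIds: ReadonlySet<string>): string {
  const match = config.rules.find((rule) => userSegmentIds.has(rule.segmentId));
  return match ? match.variant : config.defaultVariant;
}

const config: RuleConfig = {
  rules: [{ segmentId: "segment-uuid-here", variant: "variant-a" }],
  defaultVariant: "control",
};

const forMember = resolveVariant(config, new Set(["segment-uuid-here"]));
const forOthers = resolveVariant(config, new Set<string>());
```

Here `forMember` resolves to `"variant-a"` and `forOthers` to `"control"`, matching the default-variant behavior described in step 4.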
## Behavior
- **Determinism:** The same `userId` + `flagKey` combination always produces the same variant. Changing the flag weights may re-assign some users.
- **Context isolation:** Context values are only used for segment rule matching. They are not stored or logged in full.
- **Audit trail:** Every variant weight or rule change is recorded in the audit log with a diff.