customerio-load-scale
Implement Customer.io load testing and horizontal scaling. Use when preparing for high traffic, running load tests, or designing queue-based architectures for scale. Trigger: "customer.io load test", "customer.io scale", "customer.io high volume", "customer.io k6", "customer.io performance test".
Best use case
customerio-load-scale is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Implement Customer.io load testing and horizontal scaling. Use when preparing for high traffic, running load tests, or designing queue-based architectures for scale. Trigger: "customer.io load test", "customer.io scale", "customer.io high volume", "customer.io k6", "customer.io performance test".
Teams using customerio-load-scale should expect a more consistent output, faster repeated execution, less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in
.claude/skills/customerio-load-scale/SKILL.mdinside your project - Restart your AI agent — it will auto-discover the skill
How customerio-load-scale Compares
| Feature / Agent | customerio-load-scale | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Implement Customer.io load testing and horizontal scaling. Use when preparing for high traffic, running load tests, or designing queue-based architectures for scale. Trigger: "customer.io load test", "customer.io scale", "customer.io high volume", "customer.io k6", "customer.io performance test".
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
Related Guides
AI Agents for Startups
Explore AI agent skills for startup validation, product research, growth experiments, documentation, and fast execution with small teams.
AI Agents for Coding
Browse AI agent skills for coding, debugging, testing, refactoring, code review, and developer workflows across Claude, Cursor, and Codex.
Best AI Skills for Claude
Explore the best AI skills for Claude and Claude Code across coding, research, workflow automation, documentation, and agent operations.
SKILL.md Source
# Customer.io Load & Scale
## Overview
Load testing and scaling strategies for high-volume Customer.io integrations: k6 load test scripts, scaling architecture selection based on volume tier, Kubernetes HPA autoscaling, message queue buffering, and rate-limit-aware batch processing.
## Scaling Architecture by Volume
| Daily Events | Architecture | Key Components |
|-------------|--------------|----------------|
| < 100K | Direct API | Singleton client, retry, connection pooling |
| 100K - 1M | Batched API | Event queue, batch processor, rate limiter |
| 1M - 10M | Queue-backed | Redis/Kafka queue, worker pool, backpressure |
| > 10M | Distributed | Multiple workspaces, sharded queues, regional routing |
Customer.io rate limit is ~100 req/sec per workspace. Plan your architecture around this.
## Instructions
### Step 1: k6 Load Test Script
```javascript
// load-tests/customerio.js
// Run: k6 run --vus 10 --duration 60s load-tests/customerio.js
import http from "k6/http";
import { check, sleep } from "k6";
import { Counter, Trend } from "k6/metrics";
const SITE_ID = __ENV.CUSTOMERIO_SITE_ID;
const API_KEY = __ENV.CUSTOMERIO_TRACK_API_KEY;
const BASE_URL = "https://track.customer.io/api/v1";
const AUTH = `${SITE_ID}:${API_KEY}`;
const identifyLatency = new Trend("cio_identify_latency");
const trackLatency = new Trend("cio_track_latency");
const errors = new Counter("cio_errors");
export const options = {
scenarios: {
identify_load: {
executor: "ramping-arrival-rate",
startRate: 10,
timeUnit: "1s",
preAllocatedVUs: 20,
maxVUs: 50,
stages: [
{ duration: "30s", target: 50 }, // Ramp to 50/sec
{ duration: "60s", target: 80 }, // Hold at 80/sec (near limit)
{ duration: "30s", target: 10 }, // Cool down
],
},
},
thresholds: {
cio_identify_latency: ["p(95)<500", "p(99)<2000"],
cio_track_latency: ["p(95)<500", "p(99)<2000"],
cio_errors: ["count<50"],
},
};
export default function () {
const userId = `k6-load-${__VU}-${__ITER}`;
const headers = {
"Content-Type": "application/json",
Authorization: `Basic ${encoding.b64encode(AUTH)}`,
};
// Identify
const identifyRes = http.put(
`${BASE_URL}/customers/${userId}`,
JSON.stringify({
email: `${userId}@loadtest.example.com`,
_load_test: true,
created_at: Math.floor(Date.now() / 1000),
}),
{ headers }
);
identifyLatency.add(identifyRes.timings.duration);
check(identifyRes, { "identify 200": (r) => r.status === 200 }) || errors.add(1);
// Track event
const trackRes = http.post(
`${BASE_URL}/customers/${userId}/events`,
JSON.stringify({
name: "load_test_event",
data: { iteration: __ITER, vu: __VU },
}),
{ headers }
);
trackLatency.add(trackRes.timings.duration);
check(trackRes, { "track 200": (r) => r.status === 200 }) || errors.add(1);
sleep(0.1); // Small delay between iterations
}
// Cleanup function — suppress test users after test
export function teardown() {
console.log("Load test complete. Clean up k6-load-* users in CIO dashboard.");
}
```
Run:
```bash
k6 run --env CUSTOMERIO_SITE_ID="$CUSTOMERIO_SITE_ID" \
--env CUSTOMERIO_TRACK_API_KEY="$CUSTOMERIO_TRACK_API_KEY" \
load-tests/customerio.js
```
### Step 2: Queue-Based Architecture
```typescript
// services/cio-queue-worker.ts
import { Queue, Worker, QueueEvents } from "bullmq";
import { TrackClient, RegionUS } from "customerio-node";
import Bottleneck from "bottleneck";
const REDIS_URL = process.env.REDIS_URL ?? "redis://localhost:6379";
// Rate limiter: 80 requests per second (leave headroom under 100/sec limit)
const limiter = new Bottleneck({
maxConcurrent: 15,
reservoir: 80,
reservoirRefreshAmount: 80,
reservoirRefreshInterval: 1000,
});
const eventQueue = new Queue("cio:events", {
connection: { url: REDIS_URL },
defaultJobOptions: {
attempts: 5,
backoff: { type: "exponential", delay: 2000 },
removeOnComplete: { count: 10000 },
removeOnFail: { count: 50000 },
},
});
// Producer — your application enqueues events here
export async function enqueueEvent(
type: "identify" | "track",
userId: string,
data: Record<string, any>
): Promise<void> {
await eventQueue.add(type, { userId, data, enqueuedAt: Date.now() });
}
// Consumer — workers process events with rate limiting
export function startEventWorkers(concurrency = 10): void {
const cio = new TrackClient(
process.env.CUSTOMERIO_SITE_ID!,
process.env.CUSTOMERIO_TRACK_API_KEY!,
{ region: RegionUS }
);
const worker = new Worker(
"cio:events",
async (job) => {
await limiter.schedule(async () => {
if (job.name === "identify") {
await cio.identify(job.data.userId, job.data.data);
} else {
await cio.track(job.data.userId, job.data.data);
}
});
},
{
connection: { url: REDIS_URL },
concurrency,
}
);
worker.on("failed", (job, err) => {
console.error(`CIO event failed: ${job?.id} — ${err.message}`);
});
// Monitor queue health
const events = new QueueEvents("cio:events", {
connection: { url: REDIS_URL },
});
setInterval(async () => {
const counts = await eventQueue.getJobCounts();
console.log(
`CIO queue: waiting=${counts.waiting} active=${counts.active} ` +
`failed=${counts.failed} completed=${counts.completed}`
);
}, 30000);
}
```
### Step 3: Kubernetes HPA Autoscaling
```yaml
# k8s/hpa.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
name: cio-worker-hpa
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: cio-event-worker
minReplicas: 2
maxReplicas: 20
metrics:
- type: Resource
resource:
name: cpu
target:
type: Utilization
averageUtilization: 70
- type: Pods
pods:
metric:
name: cio_queue_depth
target:
type: AverageValue
averageValue: "500"
behavior:
scaleUp:
stabilizationWindowSeconds: 60
policies:
- type: Pods
value: 4
periodSeconds: 60
scaleDown:
stabilizationWindowSeconds: 300
policies:
- type: Pods
value: 2
periodSeconds: 120
```
### Step 4: Batch Sender for Bulk Operations
```typescript
// lib/cio-batch-sender.ts
import { TrackClient, RegionUS } from "customerio-node";
import Bottleneck from "bottleneck";
export async function batchSend(
operations: Array<{
type: "identify" | "track";
userId: string;
data: Record<string, any>;
}>,
ratePerSec = 80
): Promise<{ succeeded: number; failed: number }> {
const cio = new TrackClient(
process.env.CUSTOMERIO_SITE_ID!,
process.env.CUSTOMERIO_TRACK_API_KEY!,
{ region: RegionUS }
);
const limiter = new Bottleneck({
maxConcurrent: 15,
reservoir: ratePerSec,
reservoirRefreshAmount: ratePerSec,
reservoirRefreshInterval: 1000,
});
let succeeded = 0;
let failed = 0;
const promises = operations.map((op, i) =>
limiter.schedule(async () => {
try {
if (op.type === "identify") {
await cio.identify(op.userId, op.data);
} else {
await cio.track(op.userId, op.data);
}
succeeded++;
} catch {
failed++;
}
if ((succeeded + failed) % 1000 === 0) {
console.log(`Progress: ${succeeded + failed}/${operations.length}`);
}
})
);
await Promise.all(promises);
return { succeeded, failed };
}
```
Install: `npm install bottleneck bullmq`
## Load Test Checklist
- [ ] Test against staging workspace (NEVER production)
- [ ] Start at 10% of target rate, ramp up gradually
- [ ] Monitor 429 error rate during test
- [ ] Check Customer.io dashboard for processing lag
- [ ] Verify cleanup of test users after load test
- [ ] Document baseline latency and throughput numbers
- [ ] Set up alerts before running at production scale
## Error Handling
| Issue | Solution |
|-------|----------|
| 429 during load test | Reduce rate, check limiter config |
| Queue backlog growing | Scale workers, increase concurrency |
| Memory pressure | Limit batch and queue sizes, enable GC |
| k6 VU exhaustion | Increase `preAllocatedVUs` and `maxVUs` |
## Resources
- [k6 Documentation](https://k6.io/docs/)
- [Bottleneck npm](https://www.npmjs.com/package/bottleneck)
- [BullMQ Documentation](https://bullmq.io/)
- [Kubernetes HPA](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/)
## Next Steps
After load testing, proceed to `customerio-known-pitfalls` for anti-patterns to avoid.Related Skills
testing-load-balancers
Validate load balancer behavior, failover, and traffic distribution. Use when performing specialized testing. Trigger with phrases like "test load balancer", "validate failover", or "check traffic distribution".
windsurf-load-scale
Scale Windsurf adoption across large organizations with workspace strategies and performance tuning. Use when rolling out Windsurf to 50+ developers, managing large monorepo workspaces, or planning enterprise-scale deployment. Trigger with phrases like "windsurf at scale", "windsurf large team", "windsurf monorepo", "windsurf organization", "windsurf 100 developers".
vercel-load-scale
Load test and scale Vercel deployments with concurrency tuning and capacity planning. Use when running performance tests, planning for traffic spikes, or optimizing serverless function scaling on Vercel. Trigger with phrases like "vercel load test", "vercel scale", "vercel performance test", "vercel capacity", "vercel benchmark".
supabase-load-scale
Scale Supabase projects for production load: read replicas, connection pooling tuning via Supavisor, compute size upgrades, CDN caching for Storage, Edge Function regional deployment, and database table partitioning. Use when preparing for traffic spikes, optimizing connection limits, setting up read replicas for analytics queries, or partitioning large tables. Trigger with phrases like "supabase scale", "supabase read replica", "supabase connection pooling", "supabase compute upgrade", "supabase CDN storage", "supabase edge function regions", "supabase partitioning", "supavisor", "supabase pool mode".
snowflake-load-scale
Implement Snowflake load testing, warehouse scaling, and capacity planning. Use when testing query performance at scale, configuring multi-cluster warehouses, or planning capacity for production Snowflake workloads. Trigger with phrases like "snowflake load test", "snowflake scale", "snowflake capacity", "snowflake benchmark", "snowflake multi-cluster".
shopify-load-scale
Load test Shopify integrations respecting API rate limits, plan capacity with k6, and scale for Shopify Plus burst events (flash sales, BFCM). Trigger with phrases like "shopify load test", "shopify scale", "shopify BFCM", "shopify flash sale", "shopify capacity", "shopify k6 test".
sentry-load-scale
Scale Sentry for high-traffic applications handling millions of events per day. Use when optimizing SDK performance at high volume, implementing adaptive sampling, managing quotas and costs at scale, or deploying Sentry across multi-region infrastructure. Trigger with phrases like "sentry high traffic", "scale sentry", "sentry millions events", "sentry high volume", "sentry quota management", "sentry load test".
salesforce-load-scale
Implement Salesforce load testing, API limit capacity planning, and Bulk API scaling. Use when running performance tests against Salesforce, planning API consumption, or scaling high-volume Salesforce integrations. Trigger with phrases like "salesforce load test", "salesforce scale", "salesforce performance test", "salesforce capacity planning", "salesforce high volume".
retellai-load-scale
Retell AI load scale — AI voice agent and phone call automation. Use when working with Retell AI for voice agents, phone calls, or telephony. Trigger with phrases like "retell load scale", "retellai-load-scale", "voice agent".
replit-load-scale
Load test and scale Replit deployments with Autoscale tuning, Reserved VM sizing, and capacity planning. Use when load testing Replit apps, optimizing Autoscale behavior, or planning capacity for production traffic. Trigger with phrases like "replit load test", "replit scale", "replit capacity", "replit performance test", "replit autoscale tuning".
perplexity-load-scale
Load test Perplexity Sonar API integrations and plan capacity. Use when running performance tests, planning for traffic growth, or benchmarking Perplexity latency under load. Trigger with phrases like "perplexity load test", "perplexity scale", "perplexity performance test", "perplexity capacity", "perplexity benchmark".
openrouter-load-balancing
Distribute OpenRouter requests across multiple keys and models for high throughput. Use when scaling beyond single-key rate limits or building high-availability systems. Triggers: 'openrouter load balance', 'openrouter scaling', 'distribute openrouter requests', 'multiple api keys'.