sentry-load-scale
Scale Sentry for high-traffic applications handling millions of events per day. Use when optimizing SDK performance at high volume, implementing adaptive sampling, managing quotas and costs at scale, or deploying Sentry across multi-region infrastructure. Trigger with phrases like "sentry high traffic", "scale sentry", "sentry millions events", "sentry high volume", "sentry quota management", "sentry load test".
Best use case
sentry-load-scale is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Teams using sentry-load-scale should expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in .claude/skills/sentry-load-scale/SKILL.md inside your project
- Restart your AI agent; it will auto-discover the skill
How sentry-load-scale Compares
| Feature / Agent | sentry-load-scale | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
Related Guides
AI Agents for Coding
Browse AI agent skills for coding, debugging, testing, refactoring, code review, and developer workflows across Claude, Cursor, and Codex.
Best AI Skills for Claude
Explore the best AI skills for Claude and Claude Code across coding, research, workflow automation, documentation, and agent operations.
ChatGPT vs Claude for Agent Skills
Compare ChatGPT and Claude for AI agent skills across coding, writing, research, and reusable workflow execution.
SKILL.md Source
# Sentry Load & Scale
Configure Sentry for applications processing 1M+ requests/day without sacrificing error visibility, burning through quota, or adding measurable SDK overhead. Covers adaptive sampling, transport buffer tuning, multi-region tagging, quota management, SDK benchmarking, batch submission, load testing, and self-hosted deployment considerations.
## Prerequisites
- Application handling sustained high traffic (>10K requests/min or >1M events/day)
- Sentry organization with quota and billing access (Settings > Subscription)
- `@sentry/node` v8+ installed (`npm ls @sentry/node`)
- Performance baseline established (p50/p95/p99 latency without Sentry)
- Event volume estimates calculated per category (errors, transactions, replays, attachments)
## Instructions
### Step 1 — Implement Adaptive Sampling
Static `tracesSampleRate` wastes quota at scale because it treats a health check the same as a checkout. Replace it with a traffic-aware `tracesSampler` that adjusts rates based on endpoint criticality and current load.
**Traffic-aware tracesSampler:**
```typescript
import * as Sentry from '@sentry/node';
// Track request volume per endpoint for adaptive rate adjustment
const endpointVolume = new Map<string, { count: number; resetAt: number }>();
const WINDOW_MS = 60_000;
function getAdaptiveRate(name: string, baseRate: number): number {
const now = Date.now();
let entry = endpointVolume.get(name);
if (!entry || now > entry.resetAt) {
entry = { count: 0, resetAt: now + WINDOW_MS };
endpointVolume.set(name, entry);
}
entry.count++;
// Scale down sampling as volume increases within window
// 0-100 req/min: full base rate
// 100-1000: halve it
// 1000+: quarter it
if (entry.count > 1000) return baseRate * 0.25;
if (entry.count > 100) return baseRate * 0.5;
return baseRate;
}
Sentry.init({
dsn: process.env.SENTRY_DSN,
tracesSampler: (samplingContext) => {
const { name, parentSampled } = samplingContext;
// Always respect parent decision for distributed tracing consistency
if (parentSampled !== undefined) return parentSampled ? 1.0 : 0;
// Tier 0: Never sample — high-frequency, zero diagnostic value
if (name?.match(/\/(health|ready|alive|ping|metrics|favicon)/)) return 0;
if (name?.match(/\.(css|js|png|jpg|svg|woff2?|ico)$/)) return 0;
// Tier 1: Business-critical, low volume (payments and checkout always; logins at an adaptive 50%)
if (name?.includes('/payment') || name?.includes('/checkout')) return 1.0;
if (name?.includes('/auth/login')) return getAdaptiveRate('auth', 0.5);
// Tier 2: Moderate sampling — API mutations (higher signal)
if (name?.startsWith('POST /api/')) return getAdaptiveRate(name, 0.05);
if (name?.startsWith('PUT /api/')) return getAdaptiveRate(name, 0.05);
if (name?.startsWith('DELETE /api/')) return getAdaptiveRate(name, 0.05);
// Tier 3: Light sampling — API reads
if (name?.startsWith('GET /api/')) return getAdaptiveRate(name, 0.02);
// Tier 4: Background jobs — sample sparingly
if (name?.startsWith('job:') || name?.startsWith('queue:')) {
return getAdaptiveRate(name, 0.01);
}
// Tier 5: Everything else — minimal baseline
return getAdaptiveRate(name || 'default', 0.005);
},
});
```
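To sanity-check the volume tiers in `getAdaptiveRate`, the expected number of sampled transactions per window can be worked out directly. The sketch below is a standalone calculation, not part of the Sentry SDK; the names `expectedSampled` and `effectiveRate` are hypothetical. It assumes the same thresholds as the sampler above: the first 100 requests in a window use the base rate, requests 101 to 1000 use half of it, and everything beyond that uses a quarter.

```typescript
// Hypothetical helper: expected sampled transactions in one 60s window
// for a single endpoint, mirroring the tiers in getAdaptiveRate.
function expectedSampled(requestsInWindow: number, baseRate: number): number {
  const full = Math.min(requestsInWindow, 100);                      // requests 1..100
  const half = Math.min(Math.max(requestsInWindow - 100, 0), 900);   // requests 101..1000
  const quarter = Math.max(requestsInWindow - 1000, 0);              // requests 1001+
  return full * baseRate + half * baseRate * 0.5 + quarter * baseRate * 0.25;
}

// Effective rate = expected sampled transactions / total requests
function effectiveRate(requestsInWindow: number, baseRate: number): number {
  if (requestsInWindow <= 0) return 0;
  return expectedSampled(requestsInWindow, baseRate) / requestsInWindow;
}

// Example: a POST endpoint (base rate 5%) at increasing volume
for (const n of [100, 1000, 5000]) {
  const sampled = expectedSampled(n, 0.05).toFixed(1);
  const rate = (effectiveRate(n, 0.05) * 100).toFixed(2);
  console.log(`${n} req/min -> ~${sampled} sampled, effective ${rate}%`);
}
```

With a 5% base rate, an endpoint receiving 5,000 requests per minute ends up with an effective rate of about 1.55%, which is the figure to use when estimating quota for hot endpoints.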
**Adaptive error deduplication with `beforeSend`:**
```typescript
// Reduce duplicate error volume by 90%+ while preserving first-occurrence fidelity
const errorCounts = new Map<string, number>();
const ERROR_WINDOW_MS = 60_000;
setInterval(() => errorCounts.clear(), ERROR_WINDOW_MS).unref(); // unref: do not keep the process alive just for this timer
Sentry.init({
dsn: process.env.SENTRY_DSN,
beforeSend(event, hint) {
const error = hint?.originalException;
const key = error instanceof Error
? `${error.name}:${error.message?.substring(0, 100)}`
: `unknown:${String(event.message || '').substring(0, 100)}`;
const count = (errorCounts.get(key) || 0) + 1;
errorCounts.set(key, count);
// First occurrence: always send with full context
if (count === 1) return event;
// 2-10: send every 5th (capture ramp-up pattern)
if (count <= 10) return count % 5 === 0 ? event : null;
// 11-100: send every 25th (confirm still happening)
if (count <= 100) return count % 25 === 0 ? event : null;
// 101+: send every 100th (volume indicator only)
return count % 100 === 0 ? event : null;
},
});
```
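The thresholds in `beforeSend` can be checked in isolation to see how much a burst of identical errors is actually reduced. This is a minimal sketch that extracts the same counting rules into a pure function; `shouldSend` and `sentInBurst` are hypothetical names, and the simulation assumes every duplicate arrives within a single 60-second window.

```typescript
// Same thresholds as the beforeSend example, extracted as a pure function.
function shouldSend(count: number): boolean {
  if (count === 1) return true;               // first occurrence
  if (count <= 10) return count % 5 === 0;    // 5th and 10th
  if (count <= 100) return count % 25 === 0;  // 25th, 50th, 75th, 100th
  return count % 100 === 0;                   // 200th, 300th, ...
}

// Count how many of `burst` identical errors in one window would be sent.
function sentInBurst(burst: number): number {
  let sent = 0;
  for (let count = 1; count <= burst; count++) {
    if (shouldSend(count)) sent++;
  }
  return sent;
}

for (const burst of [10, 100, 1000]) {
  const sent = sentInBurst(burst);
  const dropped = (1 - sent / burst) * 100;
  console.log(`${burst} duplicates -> ${sent} sent (${dropped.toFixed(1)}% dropped)`);
}
```

Under these rules a burst of 1,000 identical errors sends 16 events and a burst of 100 sends 7, so the 90%+ reduction holds once a burst reaches roughly 100 duplicates per window; a burst of 10 still sends 3.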
### Step 2 — Optimize SDK for Minimal Overhead
At high throughput, every byte and every millisecond of SDK processing matters. This configuration reduces memory footprint, payload size, and CPU time.
**Lean SDK initialization:**
```typescript
import * as Sentry from '@sentry/node';
import os from 'node:os';
Sentry.init({
dsn: process.env.SENTRY_DSN,
environment: process.env.NODE_ENV || 'production',
release: `${process.env.SERVICE_NAME}@${process.env.VERSION || 'unknown'}`,
// --- Memory reduction ---
maxBreadcrumbs: 15, // Down from 100 default; saves ~85KB/scope
maxValueLength: 200, // Truncate long string values
// --- Disable high-overhead integrations ---
integrations: (defaults) => defaults.filter(i =>
!['Console', 'ContextLines'].includes(i.name)
),
// --- No profiling at high scale (use dedicated APM if needed) ---
profilesSampleRate: 0,
// --- Transport tuning for high-throughput ---
transportOptions: {
bufferSize: 100, // Default 64; absorbs traffic spikes
},
// --- Context size limiter ---
beforeSend(event) {
// Truncate oversized contexts to prevent payload bloat
if (event.contexts) {
for (const [key, ctx] of Object.entries(event.contexts)) {
const str = JSON.stringify(ctx);
if (str.length > 2000) {
event.contexts[key] = { _truncated: true, originalSize: str.length };
}
}
}
// Strip headers that add bulk without diagnostic value
if (event.request?.headers) {
const keep = ['content-type', 'accept', 'user-agent', 'x-request-id'];
event.request.headers = Object.fromEntries(
Object.entries(event.request.headers)
.filter(([k]) => keep.includes(k.toLowerCase()))
);
}
return event;
},
// --- Multi-region tags for infrastructure visibility ---
serverName: process.env.HOSTNAME || process.env.POD_NAME || os.hostname(),
initialScope: {
tags: {
region: process.env.AWS_REGION || process.env.GCP_REGION || 'unknown',
cluster: process.env.K8S_CLUSTER || 'default',
pod: process.env.POD_NAME || 'unknown',
service: process.env.SERVICE_NAME || 'unknown',
},
},
});
```
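The context and header trimming inside `beforeSend` is easier to unit-test when it lives in its own function. The sketch below mirrors that logic against a deliberately small event shape; `MiniEvent` and `trimEvent` are hypothetical names for illustration, and the real Sentry event type carries many more fields.

```typescript
// Minimal event shape for illustration only (assumption, not the SDK type).
interface MiniEvent {
  contexts?: Record<string, Record<string, unknown>>;
  request?: { headers?: Record<string, string> };
}

const MAX_CONTEXT_CHARS = 2000;
const KEEP_HEADERS = ['content-type', 'accept', 'user-agent', 'x-request-id'];

function trimEvent(event: MiniEvent): MiniEvent {
  if (event.contexts) {
    for (const [key, ctx] of Object.entries(event.contexts)) {
      const size = JSON.stringify(ctx).length;
      if (size > MAX_CONTEXT_CHARS) {
        // Replace the oversized context with a small marker
        event.contexts[key] = { _truncated: true, originalSize: size };
      }
    }
  }
  if (event.request?.headers) {
    // Keep only allow-listed headers, compared case-insensitively
    event.request.headers = Object.fromEntries(
      Object.entries(event.request.headers)
        .filter(([name]) => KEEP_HEADERS.includes(name.toLowerCase()))
    );
  }
  return event;
}

// Usage: one oversized context is replaced and one bulky header is stripped
const trimmed = trimEvent({
  contexts: { small: { a: 1 }, big: { blob: 'x'.repeat(5000) } },
  request: { headers: { 'Content-Type': 'application/json', Cookie: 'session=abc' } },
});
console.log(trimmed.contexts, trimmed.request?.headers);
```

Keeping the function pure means the 2000-character limit and the header allow-list can be adjusted and tested without initializing the SDK.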
**Graceful shutdown ensuring event delivery:**
```typescript
import * as Sentry from '@sentry/node';
async function shutdown(signal: string) {
console.log(`${signal} received — flushing Sentry events`);
// Stop accepting new requests (`server` is your http.Server instance)
server.close();
// Flush all pending events (2s timeout prevents hanging deploys)
const flushed = await Sentry.close(2000);
if (!flushed) {
console.warn('Sentry flush timed out — some events may be lost');
}
process.exit(0);
}
process.on('SIGTERM', () => shutdown('SIGTERM'));
process.on('SIGINT', () => shutdown('SIGINT'));
```
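Because both SIGTERM and SIGINT are wired to the same handler, the shutdown sequence can be triggered twice in quick succession. One way to guard against that is to run it at most once and let later calls share the result. This is a sketch under assumptions: `createShutdown`, `closeServer`, and `flush` are hypothetical names, injected so the logic can be exercised without a real HTTP server or Sentry client.

```typescript
// Dependencies are injected; both names are hypothetical.
type ShutdownDeps = {
  closeServer: () => Promise<void>;
  flush: (timeoutMs: number) => Promise<boolean>;
};

function createShutdown(deps: ShutdownDeps, flushTimeoutMs = 2000) {
  let inFlight: Promise<boolean> | undefined;
  return function shutdown(): Promise<boolean> {
    if (!inFlight) {
      inFlight = (async () => {
        await deps.closeServer();           // stop accepting new requests first
        return deps.flush(flushTimeoutMs);  // then drain pending events
      })();
    }
    return inFlight; // later calls share the first run's result
  };
}
```

In a service, `closeServer` might wrap `server.close` in a Promise and `flush` might be `(ms) => Sentry.close(ms)`; both signal handlers can then call the same `shutdown` function.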
### Step 3 — Manage Quotas, Test Under Load, and Plan for Scale
**Quota management and reserved volume pricing:**
```
Application: 10M requests/day, 0.1% error rate, @sentry/node v8
Error events (with adaptive beforeSend):
Raw errors: 10M x 0.001 = 10,000/day
After dedup: ~1,000/day (90% reduction) = 30K/month
Transaction events (with tiered tracesSampler):
Health/static: 0% of 4M = 0
Payment (T1): 100% of 5K = 5,000/day
POST API (T2): 5% of 500K = 25,000/day
GET API (T3): 2% of 5M = 100,000/day
Other (T5): 0.5% of 500K = 2,500/day
Total: ~132K/day = 4M/month
Sentry Team plan ($26/mo base):
Errors: 30K included in base plan
Transactions: 100K included, overage 3.9M x $0.000025 = ~$97/mo
Estimated total: ~$123/month for 10M requests/day
Reserved volume (if predictable traffic):
5M txns/mo reserved = $80/mo (vs $97 on-demand)
Saves ~$17/mo, locks in price for 12 months
→ Total: ~$106/month
```
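The transaction estimate above can be reproduced with a few lines of arithmetic, which makes it easy to plug in your own traffic mix. This is a sketch only: the volumes, sample rates, and prices are the illustrative figures from this section rather than current Sentry pricing, and the names `Tier`, `transactionsPerDay`, and `monthlyCost` are hypothetical.

```typescript
// Illustrative figures copied from the estimate in this section.
interface Tier { name: string; requestsPerDay: number; sampleRate: number }

const tiers: Tier[] = [
  { name: 'health/static', requestsPerDay: 4_000_000, sampleRate: 0 },
  { name: 'payment (T1)',  requestsPerDay: 5_000,     sampleRate: 1 },
  { name: 'POST API (T2)', requestsPerDay: 500_000,   sampleRate: 0.05 },
  { name: 'GET API (T3)',  requestsPerDay: 5_000_000, sampleRate: 0.02 },
  { name: 'other (T5)',    requestsPerDay: 500_000,   sampleRate: 0.005 },
];

// Sum of (requests x sample rate) across tiers
function transactionsPerDay(ts: Tier[]): number {
  return ts.reduce((sum, t) => sum + t.requestsPerDay * t.sampleRate, 0);
}

// Base price plus on-demand overage beyond the included volume
function monthlyCost(txPerDay: number, opts: {
  basePrice: number; includedTx: number; overagePerTx: number; daysPerMonth?: number;
}): number {
  const monthlyTx = txPerDay * (opts.daysPerMonth ?? 30);
  const overage = Math.max(monthlyTx - opts.includedTx, 0);
  return opts.basePrice + overage * opts.overagePerTx;
}

const perDay = transactionsPerDay(tiers);
const cost = monthlyCost(perDay, { basePrice: 26, includedTx: 100_000, overagePerTx: 0.000025 });
console.log(`${perDay} tx/day, ~$${cost.toFixed(0)}/month on-demand`);
```

With these inputs the daily total is 132,500 transactions and the on-demand cost rounds to about $123 per month, matching the estimate; changing a single tier's sample rate shows its effect on the bill immediately.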
**SDK overhead benchmarks:**
```typescript
import * as Sentry from '@sentry/node';
import { performance } from 'node:perf_hooks';
// Measure SDK initialization cost
const initStart = performance.now();
Sentry.init({ /* ... */ });
const initMs = performance.now() - initStart;
console.log(`Sentry.init: ${initMs.toFixed(1)}ms`);
// Expected: 5-15ms (Node.js), acceptable <50ms
// Measure per-request overhead with Sentry vs without
// (`handleRequest` stands in for your own request handler)
async function benchmarkOverhead(iterations: number = 1000) {
// Baseline: request without Sentry instrumentation
const baseStart = performance.now();
for (let i = 0; i < iterations; i++) {
await handleRequest({ path: '/api/test', method: 'GET' });
}
const baseMs = (performance.now() - baseStart) / iterations;
// Instrumented: request with Sentry span
const sentryStart = performance.now();
for (let i = 0; i < iterations; i++) {
await Sentry.startSpan(
{ name: 'GET /api/test', op: 'http.server' },
() => handleRequest({ path: '/api/test', method: 'GET' })
);
}
const sentryMs = (performance.now() - sentryStart) / iterations;
console.log(`Baseline: ${baseMs.toFixed(3)}ms/req`);
console.log(`With Sentry: ${sentryMs.toFixed(3)}ms/req`);
console.log(`Overhead: ${(sentryMs - baseMs).toFixed(3)}ms (${(((sentryMs - baseMs) / baseMs) * 100).toFixed(1)}%)`);
// Healthy: <0.5ms overhead per request, <2% CPU impact
}
```
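The benchmark above reports mean latency, while the prerequisites call for a p50/p95/p99 baseline. A small percentile helper covers that gap if per-request timings are collected into an array. This is a minimal sketch using the nearest-rank method; `percentile` and `summarize` are hypothetical names, and the sample data is synthetic.

```typescript
// Nearest-rank percentile over per-request latency samples (in ms).
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error('no samples');
  if (p <= 0 || p > 100) throw new Error('p must be in (0, 100]');
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100); // 1-based rank
  return sorted[rank - 1];
}

function summarize(samples: number[]) {
  return {
    p50: percentile(samples, 50),
    p95: percentile(samples, 95),
    p99: percentile(samples, 99),
  };
}

// Usage with synthetic latencies of 1..100 ms
const baseline = Array.from({ length: 100 }, (_, i) => i + 1);
console.log(summarize(baseline)); // { p50: 50, p95: 95, p99: 99 }
```

Comparing `summarize` output for the baseline and instrumented runs shows whether the SDK affects tail latency, which a mean can hide.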
**Load testing Sentry integration with k6:**
```javascript
// k6-sentry-load-test.js
// Run: k6 run k6-sentry-load-test.js
// (omit --vus/--duration: CLI flags take precedence over the stages defined below)
import http from 'k6/http';
import { check, sleep } from 'k6';
import { Rate, Trend } from 'k6/metrics';
const errorRate = new Rate('sentry_errors_captured');
const latencyOverhead = new Trend('sentry_latency_overhead_ms');
export const options = {
stages: [
{ duration: '1m', target: 50 }, // Ramp up
{ duration: '3m', target: 200 }, // Sustained load
{ duration: '1m', target: 0 }, // Ramp down
],
thresholds: {
http_req_duration: ['p(95)<500'], // p95 under 500ms with Sentry
sentry_latency_overhead_ms: ['p(95)<5'], // Sentry adds <5ms at p95
},
};
const BASE_URL = __ENV.BASE_URL || 'http://localhost:3000';
export default function () {
// Normal traffic: API reads (high volume, low sample rate)
const readRes = http.get(`${BASE_URL}/api/products`);
check(readRes, { 'GET 200': (r) => r.status === 200 });
// Track overhead via server timing header (if exposed)
const sentryMs = readRes.headers['Server-Timing']?.match(/sentry;dur=(\d+\.?\d*)/);
if (sentryMs) latencyOverhead.add(parseFloat(sentryMs[1]));
// Occasional writes (lower volume, higher sample rate)
if (Math.random() < 0.1) {
const writeRes = http.post(`${BASE_URL}/api/orders`, JSON.stringify({
items: [{ sku: 'TEST-001', qty: 1 }],
}), { headers: { 'Content-Type': 'application/json' } });
check(writeRes, { 'POST 201': (r) => r.status === 201 });
}
// Hit an unknown route occasionally; this only records the 404 response, so confirm in Sentry that matching events arrived
if (Math.random() < 0.01) {
const errRes = http.get(`${BASE_URL}/api/nonexistent-route`);
errorRate.add(errRes.status === 404);
}
sleep(0.1);
}
```
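The k6 script only records `sentry_latency_overhead_ms` if the server exposes a `Server-Timing` header with a `sentry` entry, and nothing in this guide emits one. The sketch below shows one possible shape for that header and checks it against the same pattern the k6 script uses. The function names are hypothetical, the metric name `sentry` is an assumption chosen to match the k6 regex, and how the duration itself is measured is left to the application.

```typescript
// Format a Server-Timing entry such as "sentry;dur=1.25".
function serverTimingEntry(name: string, durationMs: number): string {
  return `${name};dur=${durationMs.toFixed(2)}`;
}

// Same pattern the k6 script uses to read the value back.
function parseSentryTiming(headerValue: string | undefined): number | undefined {
  const match = headerValue?.match(/sentry;dur=(\d+\.?\d*)/);
  return match ? parseFloat(match[1]) : undefined;
}

// Usage: combine several entries into one header value
const timingHeader = [
  serverTimingEntry('db', 12.5),
  serverTimingEntry('sentry', 0.42),
].join(', ');
console.log(timingHeader, parseSentryTiming(timingHeader));
```

If the header is absent the parser returns `undefined`, which matches the k6 script's behavior of skipping the metric.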
**Background worker batch patterns:**
```typescript
import * as Sentry from '@sentry/node';
// For queue workers processing millions of jobs/day
async function processJobBatch(jobs: Job[]) {
// Group jobs for batch-level tracing instead of per-job spans
return Sentry.startSpan(
{
name: `batch.${jobs[0]?.type || 'unknown'}`,
op: 'queue.batch',
attributes: { 'batch.size': jobs.length },
},
async () => {
const results = { success: 0, failed: 0 };
for (const job of jobs) {
try {
await Sentry.withScope(async (scope) => {
scope.setTag('job.type', job.type);
scope.setTag('job.queue', job.queue);
scope.setContext('job', {
id: job.id,
attempts: job.attempts,
});
await executeJob(job);
results.success++;
});
} catch (error) {
results.failed++;
Sentry.captureException(error, {
tags: { 'job.id': job.id, 'job.type': job.type },
level: job.attempts >= 3 ? 'error' : 'warning',
});
}
}
// Guard against an empty batch to avoid recording NaN
Sentry.setMeasurement('batch.success_rate',
jobs.length ? results.success / jobs.length : 0, 'ratio');
return results;
}
);
}
// Periodic flush for long-running workers (don't rely on process exit)
setInterval(async () => {
await Sentry.flush(2000);
}, 30_000);
```
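`processJobBatch` assumes the caller has already grouped jobs into batches. A generic chunking helper is one way to do that before handing each group to the batch function. This is a minimal sketch; `chunk` is a hypothetical helper, and the batch size is something to tune for your queue.

```typescript
// Split a list of jobs into fixed-size batches so each batch gets one span.
function chunk<T>(items: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new Error('size must be a positive integer');
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Usage: 5 jobs in batches of 2 gives 3 batches, the last one partial
const jobBatches = chunk(['a', 'b', 'c', 'd', 'e'], 2);
console.log(jobBatches.map((b) => b.length)); // [2, 2, 1]
```

Larger batches mean fewer spans and lower quota use, at the cost of coarser timing data per job.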
**Self-hosted Sentry for enterprise (>100M events/month):**
Key tuning for self-hosted (`docker-compose.override.yml` on top of [getsentry/self-hosted](https://github.com/getsentry/self-hosted)):
- Relay: `RELAY_PROCESSING_MAX_RATE: 50000`, `RELAY_UPSTREAM_MAX_CONNECTIONS: 200`
- Kafka: `KAFKA_NUM_PARTITIONS: 32` (match to consumer count)
- Snuba: 4+ consumer replicas for ClickHouse ingestion parallelism
- ClickHouse: 16G+ RAM, dedicated SSD volumes
```
Self-hosted vs SaaS break-even:
SaaS at 100M events/month: ~$2,500/mo (Business plan + overage)
Self-hosted (3x r6g.2xlarge): ~$1,200/mo infra + $800/mo ops (0.25 FTE)
Break-even: ~50M events/month
→ Use SaaS up to 50M events; evaluate self-hosted above that
```
## Output
- Error deduplication in `beforeSend` reducing duplicate error volume by 90%+ while preserving first-occurrence fidelity
- Traffic-aware `tracesSampler` with 5 tiers adjusting dynamically based on endpoint volume
- SDK memory and CPU footprint minimized (15 breadcrumbs, truncated contexts, filtered headers)
- Transport buffer sized via `transportOptions.bufferSize` to absorb traffic spikes during event submission
- Multi-region infrastructure tags for filtering by region/cluster/pod in Sentry dashboard
- Cost model with reserved volume pricing showing $106/month for 10M requests/day
- k6 load test script validating Sentry overhead stays under 5ms at p95
- Batch job processing pattern with scope isolation and periodic flush
- Self-hosted vs SaaS break-even analysis for enterprise decision-making
## Error Handling
| Error | Cause | Solution |
|-------|-------|----------|
| Events silently dropped | SDK buffer full during traffic spike | Increase `transportOptions.bufferSize` to 200+, verify network to Sentry ingest |
| 429 rate limit from Sentry | Quota exhausted or spike protection triggered | Enable spike protection in Settings > Subscription, reduce sample rates |
| Memory growing linearly over time | Breadcrumb or scope accumulation | Reduce `maxBreadcrumbs`, verify `withScope` is used (not `configureScope`) |
| Lost events on deploy/restart | No `Sentry.close()` in shutdown handler | Add SIGTERM/SIGINT handlers calling `Sentry.close(2000)` |
| Distributed traces broken at scale | Mixed sampling decisions across services | Always check `parentSampled` first in `tracesSampler` |
| ClickHouse OOM on self-hosted | Insufficient memory for event volume | Allocate 16G+ RAM, increase Snuba consumer replicas |
| k6 shows >5ms Sentry overhead | Too many integrations or large payloads | Disable Console/ContextLines integrations, reduce `maxValueLength` |
| Quota burn from replay/attachments | Replays not rate-limited separately | Set `replaysSessionSampleRate: 0.01` and `replaysOnErrorSampleRate: 0.1` |
## Examples
**Minimal high-scale init (copy-paste ready):**
```typescript
import * as Sentry from '@sentry/node';
Sentry.init({
dsn: process.env.SENTRY_DSN,
environment: process.env.NODE_ENV,
release: `${process.env.SERVICE_NAME}@${process.env.VERSION}`,
maxBreadcrumbs: 15,
maxValueLength: 200,
profilesSampleRate: 0,
tracesSampler: ({ name, parentSampled }) => {
if (parentSampled !== undefined) return parentSampled ? 1.0 : 0;
if (name?.match(/\/(health|ping|metrics)/)) return 0;
if (name?.includes('/payment')) return 1.0;
if (name?.startsWith('POST /api/')) return 0.05;
return 0.005;
},
});
```
**Verify sampling is working as expected:**
```typescript
// Add to non-production environments temporarily
Sentry.init({
// ... config ...
tracesSampler: (ctx) => {
const rate = calculateRate(ctx); // your logic
if (process.env.DEBUG_SENTRY === 'true') {
console.log(`[sentry] ${ctx.name} → rate=${rate}`);
}
return rate;
},
});
```
## Resources
- [Quota Management](https://docs.sentry.io/pricing/quotas/) — spike protection, rate limits, reserved volume
- [Sampling Configuration](https://docs.sentry.io/platforms/javascript/configuration/sampling/) — tracesSampler API reference
- [Transport Configuration](https://docs.sentry.io/platforms/javascript/configuration/transports/) — custom transport, buffer size
- [Self-Hosted Sentry](https://develop.sentry.dev/self-hosted/) — installation and scaling guide
- [Pricing Calculator](https://sentry.io/pricing/) — estimate costs by event volume
- [SDK Performance Overhead](https://docs.sentry.io/platforms/javascript/performance/) — benchmarks and best practices
## Next Steps
- Run the k6 load test against staging to establish your baseline Sentry overhead
- Set up Sentry Spike Protection (Settings > Subscription > Spike Protection) before going to production
- Configure server-side sampling rules in Sentry Dynamic Sampling (Project Settings > Performance) to complement client-side `tracesSampler`
- Create a Sentry dashboard with widgets for: events/hour by category, quota usage %, p95 SDK overhead
- Review the `sentry-cost-tuning` skill for detailed quota optimization strategies
Related Skills
testing-load-balancers
Validate load balancer behavior, failover, and traffic distribution. Use when performing specialized testing. Trigger with phrases like "test load balancer", "validate failover", or "check traffic distribution".
windsurf-load-scale
Scale Windsurf adoption across large organizations with workspace strategies and performance tuning. Use when rolling out Windsurf to 50+ developers, managing large monorepo workspaces, or planning enterprise-scale deployment. Trigger with phrases like "windsurf at scale", "windsurf large team", "windsurf monorepo", "windsurf organization", "windsurf 100 developers".
vercel-load-scale
Load test and scale Vercel deployments with concurrency tuning and capacity planning. Use when running performance tests, planning for traffic spikes, or optimizing serverless function scaling on Vercel. Trigger with phrases like "vercel load test", "vercel scale", "vercel performance test", "vercel capacity", "vercel benchmark".
supabase-load-scale
Scale Supabase projects for production load: read replicas, connection pooling tuning via Supavisor, compute size upgrades, CDN caching for Storage, Edge Function regional deployment, and database table partitioning. Use when preparing for traffic spikes, optimizing connection limits, setting up read replicas for analytics queries, or partitioning large tables. Trigger with phrases like "supabase scale", "supabase read replica", "supabase connection pooling", "supabase compute upgrade", "supabase CDN storage", "supabase edge function regions", "supabase partitioning", "supavisor", "supabase pool mode".
snowflake-load-scale
Implement Snowflake load testing, warehouse scaling, and capacity planning. Use when testing query performance at scale, configuring multi-cluster warehouses, or planning capacity for production Snowflake workloads. Trigger with phrases like "snowflake load test", "snowflake scale", "snowflake capacity", "snowflake benchmark", "snowflake multi-cluster".
shopify-load-scale
Load test Shopify integrations respecting API rate limits, plan capacity with k6, and scale for Shopify Plus burst events (flash sales, BFCM). Trigger with phrases like "shopify load test", "shopify scale", "shopify BFCM", "shopify flash sale", "shopify capacity", "shopify k6 test".
sentry-upgrade-migration
Upgrade Sentry SDK versions and migrate breaking API changes. Use when upgrading from Sentry v7 to v8, migrating Python SDK v1 to v2, replacing deprecated Hub/Transaction APIs, or running the migr8 codemod. Trigger: "upgrade sentry", "sentry migration", "sentry breaking changes", "migrate sentry v7 to v8", "update sentry sdk".
sentry-security-basics
Configure Sentry security settings and data protection. Use when setting up PII scrubbing, managing sensitive data, configuring data scrubbing rules, or hardening Sentry for compliance. Trigger with phrases like "sentry security", "sentry PII", "sentry data scrubbing", "secure sentry", "sentry GDPR".
sentry-sdk-patterns
Best practices for using Sentry SDK in TypeScript and Python. Use when implementing structured error context with scopes, breadcrumb strategies, beforeSend/beforeBreadcrumb filtering, custom fingerprinting, user context, or performance span creation. Trigger: "sentry best practices", "sentry patterns", "sentry sdk usage", "sentry scope", "sentry breadcrumbs", "sentry beforeSend", "sentry fingerprint".
sentry-reliability-patterns
Build reliable Sentry integrations with graceful degradation, circuit breakers, and offline queuing. Use when implementing fault-tolerant error tracking, handling SDK initialization failures, building retry logic for Sentry transports, or ensuring apps survive Sentry outages. Trigger with "sentry reliability", "sentry circuit breaker", "sentry offline queue", "sentry graceful degradation", "sentry failover", or "resilient sentry setup".
sentry-release-management
Manage Sentry releases with versioning, commit association, and source map uploads. Use when creating releases, linking commits to errors, uploading release artifacts, monitoring release health, or cleaning up old releases. Trigger with phrases like "sentry release", "create sentry version", "sentry source maps", "sentry suspect commits", "release health".
sentry-reference-architecture
Design production-grade Sentry architecture for multi-service organizations. Use when planning Sentry rollout, structuring projects across teams, building shared config modules, or setting up distributed tracing. Trigger: "sentry architecture", "sentry project structure", "sentry reference design", "sentry distributed tracing".