snowflake-load-scale

Implement Snowflake load testing, warehouse scaling, and capacity planning. Use when testing query performance at scale, configuring multi-cluster warehouses, or planning capacity for production Snowflake workloads. Trigger with phrases like "snowflake load test", "snowflake scale", "snowflake capacity", "snowflake benchmark", "snowflake multi-cluster".

1,868 stars

Best use case

snowflake-load-scale is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using snowflake-load-scale can expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/snowflake-load-scale/SKILL.md --create-dirs "https://raw.githubusercontent.com/jeremylongshore/claude-code-plugins-plus-skills/main/plugins/saas-packs/snowflake-pack/skills/snowflake-load-scale/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/snowflake-load-scale/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How snowflake-load-scale Compares

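| Feature / Agent | snowflake-load-scale | Standard Approach |
|-----------------|----------------------|-------------------|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |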

Frequently Asked Questions

What does this skill do?

It walks an agent through five steps: benchmarking critical queries, comparing warehouse sizes, simulating concurrent load, configuring multi-cluster warehouses, and tracking credit and storage growth for capacity planning.

Where can I find the source code?

The source is on GitHub in the jeremylongshore/claude-code-plugins-plus-skills repository, under plugins/saas-packs/snowflake-pack/skills/snowflake-load-scale/.

SKILL.md Source

# Snowflake Load & Scale

## Overview

Load testing, scaling strategies, and capacity planning for Snowflake workloads using warehouse sizing, multi-cluster configuration, and concurrent query simulation.

## Scaling Model

| Dimension | How to Scale | When |
|-----------|-------------|------|
| Single query speed | Scale UP (bigger warehouse) | Complex queries, large scans |
| Concurrent queries | Scale OUT (multi-cluster) | Many users, dashboard refresh |
| Data volume | Scale UP + clustering | Tables > 1TB |
| Mixed workloads | Separate warehouses | ETL + analytics on same data |
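
The table can be read as a small decision rule. The sketch below encodes it in Python under stated assumptions: the function name `recommend_scaling`, its inputs, and the 1 TB cutoff taken from the table are illustrative, not Snowflake settings.

```python
def recommend_scaling(queued: bool, slow_single_query: bool,
                      table_tb: float, mixed_workload: bool) -> list:
    """Map observed symptoms to the scaling actions listed in the table.

    All parameters are hypothetical inputs you would collect yourself,
    for example from query history and warehouse load history.
    """
    actions = []
    if mixed_workload:
        # ETL and analytics competing for the same compute
        actions.append("separate warehouses")
    if queued:
        # Many users or dashboard refreshes waiting for a slot
        actions.append("scale out (multi-cluster)")
    if table_tb > 1.0:
        # Large tables benefit from a bigger warehouse plus clustering
        actions.append("scale up + clustering")
    elif slow_single_query:
        # Complex queries or large scans on smaller tables
        actions.append("scale up (bigger warehouse)")
    return actions or ["no change"]
```

For example, `recommend_scaling(queued=True, slow_single_query=False, table_tb=0.2, mixed_workload=False)` returns `["scale out (multi-cluster)"]`.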

## Instructions

### Step 1: Benchmark Current Performance

```sql
-- Baseline metrics for critical queries
-- Run each query 3 times and record results

-- Disable result cache for accurate benchmarking
ALTER SESSION SET USE_CACHED_RESULT = FALSE;

-- Test query 1: Point lookup
SELECT * FROM orders WHERE order_id = 12345;

-- Test query 2: Aggregation
SELECT DATE_TRUNC('month', order_date) AS month,
       COUNT(*) AS orders, SUM(amount) AS revenue
FROM orders
WHERE order_date >= '2025-01-01'
GROUP BY month ORDER BY month;

-- Test query 3: Join + filter
SELECT c.name, SUM(o.amount) AS total_spend
FROM customers c
JOIN orders o ON c.id = o.customer_id
WHERE o.order_date >= DATEADD(days, -90, CURRENT_DATE())
GROUP BY c.name
ORDER BY total_spend DESC
LIMIT 100;

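-- Record results
-- Note: the partition and spill columns below are documented for
-- SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY (which can lag by up to 45 minutes).
-- If this table function rejects them in your account, drop those columns
-- here and read them from ACCOUNT_USAGE by query_id instead.
SELECT query_id, query_text, warehouse_name, warehouse_size,
       total_elapsed_time / 1000 AS seconds,  -- total_elapsed_time is in milliseconds
       bytes_scanned / 1e9 AS gb_scanned,
       rows_produced, partitions_scanned, partitions_total,
       bytes_spilled_to_local_storage, bytes_spilled_to_remote_storage
FROM TABLE(INFORMATION_SCHEMA.QUERY_HISTORY_BY_SESSION())
ORDER BY start_time DESC
LIMIT 10;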

-- Re-enable cache
ALTER SESSION SET USE_CACHED_RESULT = TRUE;
```
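
Since each query is run three times, the recorded rows need to be reduced to one figure per query. The following is a minimal sketch, assuming the history rows have been exported as Python dicts; the function name `summarize_runs` and the `label` key are hypothetical, while the other keys mirror the column aliases selected in the SQL.

```python
from statistics import median

def summarize_runs(runs):
    """Group benchmark runs by label and report median time and scan ratio.

    runs: list of dicts with keys 'label', 'seconds',
          'partitions_scanned', and 'partitions_total'.
    """
    by_label = {}
    for run in runs:
        by_label.setdefault(run["label"], []).append(run)

    summary = {}
    for label, rows in by_label.items():
        scanned = sum(r["partitions_scanned"] for r in rows)
        total = sum(r["partitions_total"] for r in rows)
        summary[label] = {
            # Median is less sensitive than the mean to one cold-cache run
            "median_seconds": median([r["seconds"] for r in rows]),
            # Fraction of micro-partitions read; lower means better pruning
            "scan_ratio": scanned / total if total else None,
        }
    return summary
```

A scan ratio close to 1.0 on a filtered query suggests the filter is not pruning partitions, which is worth checking before testing larger warehouses.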

### Step 2: Test Warehouse Size Impact

```sql
-- Run same query on different warehouse sizes to find optimal
-- XS → S → M → L → XL

ALTER WAREHOUSE BENCHMARK_WH SET WAREHOUSE_SIZE = 'XSMALL';
ALTER SESSION SET USE_CACHED_RESULT = FALSE;

-- Run your benchmark query
SELECT /* BENCHMARK_XS */ ...;

ALTER WAREHOUSE BENCHMARK_WH SET WAREHOUSE_SIZE = 'SMALL';
SELECT /* BENCHMARK_S */ ...;

ALTER WAREHOUSE BENCHMARK_WH SET WAREHOUSE_SIZE = 'MEDIUM';
SELECT /* BENCHMARK_M */ ...;

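-- Compare results
-- Match only the /* BENCHMARK_... */ markers: an unescaped '_' is a LIKE wildcard,
-- and a bare 'BENCHMARK_' would also match the ALTER WAREHOUSE BENCHMARK_WH statements.
SELECT warehouse_size, query_id,
       total_elapsed_time / 1000 AS seconds,
       bytes_scanned / 1e9 AS gb_scanned
FROM TABLE(INFORMATION_SCHEMA.QUERY_HISTORY_BY_SESSION())
WHERE query_text LIKE '%/* BENCHMARK!_%' ESCAPE '!'
  AND query_text NOT ILIKE '%QUERY_HISTORY_BY_SESSION%'  -- exclude this comparison query
ORDER BY start_time DESC;

-- Typical scaling: doubling size roughly halves runtime for scan-heavy queries
-- Diminishing returns for small/simple queries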
```
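
Whether a larger size is worth it depends on cost as well as speed. The sketch below compares both, assuming standard warehouses billed at 1, 2, 4, 8, and 16 credits per hour for XS through XL. The function name `compare_sizes` and its input shape are hypothetical, and the per-run cost ignores the 60-second minimum charge on warehouse start and any idle time before auto-suspend.

```python
# Credits per hour for standard warehouses, XS through XL
CREDITS_PER_HOUR = {"XSMALL": 1, "SMALL": 2, "MEDIUM": 4, "LARGE": 8, "XLARGE": 16}

def compare_sizes(seconds_by_size):
    """Return speedup vs. the smallest tested size and credits used per run.

    seconds_by_size: dict such as {"XSMALL": 120.0, "SMALL": 60.0}.
    """
    order = [size for size in CREDITS_PER_HOUR if size in seconds_by_size]
    if not order:
        raise ValueError("no recognised warehouse sizes in input")
    baseline = seconds_by_size[order[0]]
    report = []
    for size in order:
        secs = seconds_by_size[size]
        report.append({
            "size": size,
            "speedup": round(baseline / secs, 2),
            # Cost of one run if the warehouse were billed only while it executes
            "credits_per_run": round(CREDITS_PER_HOUR[size] * secs / 3600, 4),
        })
    return report
```

When doubling the size halves the runtime, `credits_per_run` stays flat and the faster size is effectively free. Once the speedup falls below 2x per step, each step up costs more per query.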

### Step 3: Concurrent Load Testing

```python
# load_test.py — simulate concurrent Snowflake queries
import snowflake.connector
import threading
import time
import os
from statistics import mean, median

CONCURRENT_USERS = 20
QUERIES_PER_USER = 10
WAREHOUSE = 'LOAD_TEST_WH'

TEST_QUERIES = [
    "SELECT COUNT(*) FROM orders WHERE order_date = CURRENT_DATE() - 1",
    "SELECT customer_id, SUM(amount) FROM orders GROUP BY customer_id LIMIT 100",
    "SELECT * FROM orders WHERE order_id = %s",
]

results = []
errors = []

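def run_user_session(user_id: int):
    try:
        conn = snowflake.connector.connect(
            account=os.environ['SNOWFLAKE_ACCOUNT'],
            user=os.environ['SNOWFLAKE_USER'],
            password=os.environ['SNOWFLAKE_PASSWORD'],
            warehouse=WAREHOUSE,
            database='PROD_DW',
            schema='GOLD',
            # Skip the result cache so repeated queries exercise the warehouse
            session_parameters={'USE_CACHED_RESULT': False},
        )
    except Exception as e:
        # Count a failed login instead of letting the thread die silently
        errors.append({'user': user_id, 'query': None, 'error': str(e)})
        return
    try:
        cursor = conn.cursor()
        for i in range(QUERIES_PER_USER):
            query = TEST_QUERIES[i % len(TEST_QUERIES)]
            start = time.perf_counter()
            try:
                if '%s' in query:
                    # Bind a per-user order_id for the point lookup
                    cursor.execute(query, (user_id * 1000 + i,))
                else:
                    cursor.execute(query)
                cursor.fetchall()
                elapsed = time.perf_counter() - start
                results.append({'user': user_id, 'query': i, 'seconds': elapsed})
            except Exception as e:
                errors.append({'user': user_id, 'query': i, 'error': str(e)})
    finally:
        # Always release the session, even if the loop fails unexpectedly
        conn.close()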

# Run concurrent sessions
threads = []
start_time = time.time()
for uid in range(CONCURRENT_USERS):
    t = threading.Thread(target=run_user_session, args=(uid,))
    threads.append(t)
    t.start()
for t in threads:
    t.join()
total_time = time.time() - start_time

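# Report
times = sorted(r['seconds'] for r in results)
print("=== Load Test Results ===")
print(f"Users: {CONCURRENT_USERS}, Queries/user: {QUERIES_PER_USER}")
print(f"Total queries: {len(results)}, Errors: {len(errors)}")
print(f"Total time: {total_time:.1f}s")
if times:
    # Approximate P95; clamp the index so it is always in range
    p95 = times[min(len(times) - 1, int(len(times) * 0.95))]
    print(f"Avg latency: {mean(times):.3f}s")
    print(f"Median: {median(times):.3f}s")
    print(f"P95: {p95:.3f}s")
    print(f"QPS: {len(results)/total_time:.1f}")
else:
    # mean() and median() raise on empty input, so report the failures instead
    print("No queries succeeded; first errors:", errors[:5])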
```
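
The P95 in the report is taken by indexing into the sorted latencies. A reusable helper makes the method explicit and lets you report other percentiles as well. This is a minimal sketch of the nearest-rank method; the function name `percentile` is hypothetical and it expects an integer `pct` between 1 and 100.

```python
import math

def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of
    all samples at or below it."""
    if not values:
        raise ValueError("no samples")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(values)
    # Multiply before dividing so whole-number ranks are computed exactly
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]
```

With 200 recorded latencies, `percentile(times, 95)` returns the 190th smallest value, and `percentile(times, 99)` the 198th.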

### Step 4: Multi-Cluster Warehouse Configuration

```sql
-- Multi-cluster warehouses require Enterprise Edition or higher
-- Standard scaling: Snowflake adds clusters when queries queue
CREATE OR REPLACE WAREHOUSE ANALYTICS_WH
  WAREHOUSE_SIZE = 'MEDIUM'
  MIN_CLUSTER_COUNT = 1
  MAX_CLUSTER_COUNT = 6
  SCALING_POLICY = 'STANDARD'
  AUTO_SUSPEND = 300
  AUTO_RESUME = TRUE;

-- Economy scaling: tolerates queuing, minimizes cost
ALTER WAREHOUSE ANALYTICS_WH SET SCALING_POLICY = 'ECONOMY';

-- Maximized mode: all clusters start whenever the warehouse is running (predictable latency)
CREATE WAREHOUSE DASHBOARD_WH
  WAREHOUSE_SIZE = 'SMALL'
  MIN_CLUSTER_COUNT = 3
  MAX_CLUSTER_COUNT = 3    -- Same = maximized mode
  AUTO_SUSPEND = 120
  AUTO_RESUME = TRUE;

-- Monitor multi-cluster behavior
SELECT start_time, warehouse_name,
       avg_running, avg_queued_load, avg_queued_provisioning
FROM TABLE(INFORMATION_SCHEMA.WAREHOUSE_LOAD_HISTORY(
  DATE_RANGE_START => DATEADD(hours, -4, CURRENT_TIMESTAMP()),
  WAREHOUSE_NAME => 'ANALYTICS_WH'
))
WHERE avg_queued_load > 0
ORDER BY start_time DESC;
```
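
A starting value for `MAX_CLUSTER_COUNT` can be estimated from the peak concurrency seen in the load test. The sketch below assumes each cluster runs about 8 queries at once, which is the default for the `MAX_CONCURRENCY_LEVEL` parameter. Real concurrency per cluster depends on how heavy the queries are, so treat the result as a first guess to validate against `avg_queued_load`. The function name `estimate_max_clusters` and the `budget_cap` argument are hypothetical.

```python
import math

def estimate_max_clusters(peak_concurrent_queries: int,
                          queries_per_cluster: int = 8,
                          budget_cap: int = 6) -> int:
    """Estimate a MAX_CLUSTER_COUNT from observed peak concurrency.

    budget_cap is a self-imposed ceiling on clusters (cost control),
    not a Snowflake limit.
    """
    if queries_per_cluster < 1:
        raise ValueError("queries_per_cluster must be at least 1")
    needed = math.ceil(peak_concurrent_queries / queries_per_cluster)
    # Never go below one cluster or above the budget ceiling
    return max(1, min(needed, budget_cap))
```

For the 20 concurrent users simulated in Step 3, `estimate_max_clusters(20)` returns 3.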

### Step 5: Capacity Planning

```sql
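-- Weekly growth analysis
-- Aggregate each view by week first: joining the raw views on warehouse_name
-- multiplies every metering row by every query row and inflates the credit totals.
WITH credits AS (
  SELECT DATE_TRUNC('week', start_time) AS week,
         SUM(credits_used) AS weekly_credits
  FROM SNOWFLAKE.ACCOUNT_USAGE.WAREHOUSE_METERING_HISTORY
  WHERE start_time >= DATEADD(months, -3, CURRENT_TIMESTAMP())
  GROUP BY 1
),
queries AS (
  SELECT DATE_TRUNC('week', start_time) AS week,
         COUNT(*) AS weekly_queries
  FROM SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY
  WHERE start_time >= DATEADD(months, -3, CURRENT_TIMESTAMP())
    AND warehouse_name IS NOT NULL  -- only queries that ran on a warehouse
  GROUP BY 1
)
SELECT c.week, c.weekly_credits, q.weekly_queries,
       ROUND(c.weekly_credits / NULLIF(q.weekly_queries, 0), 4) AS credits_per_query
FROM credits c
LEFT JOIN queries q ON c.week = q.week
ORDER BY c.week;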

-- Storage growth trend
SELECT usage_date,
       ROUND(storage_bytes / 1e12, 3) AS data_tb,
       LAG(ROUND(storage_bytes / 1e12, 3)) OVER (ORDER BY usage_date) AS prev_tb,
       ROUND((storage_bytes - LAG(storage_bytes) OVER (ORDER BY usage_date)) / 1e9, 1) AS daily_growth_gb
FROM SNOWFLAKE.ACCOUNT_USAGE.STORAGE_USAGE
WHERE usage_date >= DATEADD(days, -30, CURRENT_DATE())
ORDER BY usage_date;
```
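
The storage trend can be turned into a simple forecast. The sketch below assumes linear growth between the first and last reading, which is a rough approximation; the function name `days_until_limit` and the `limit_tb` threshold (for example a budgeted storage ceiling) are hypothetical.

```python
import math

def days_until_limit(storage_tb, limit_tb):
    """Estimate the days until storage reaches limit_tb.

    storage_tb: daily readings in TB, oldest first (the data_tb column).
    Returns None when storage is flat or shrinking.
    """
    if len(storage_tb) < 2:
        raise ValueError("need at least two daily readings")
    # Average daily growth across the whole window
    daily_growth = (storage_tb[-1] - storage_tb[0]) / (len(storage_tb) - 1)
    if daily_growth <= 0:
        return None
    remaining = limit_tb - storage_tb[-1]
    # Already at or over the limit counts as zero days
    return max(0, math.ceil(remaining / daily_growth))
```

For readings of 10.0, 10.5, and 11.0 TB and a 15 TB limit, the average growth is 0.5 TB per day and the function returns 8.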

## Benchmark Results Template

```
## Snowflake Performance Benchmark
Date: YYYY-MM-DD
Environment: [staging/production]
Table size: [X rows, Y GB]

| Warehouse | Query Type | Avg (s) | P95 (s) | GB Scanned | Spill |
|-----------|-----------|---------|---------|-----------|-------|
| XS        | Agg       |         |         |           |       |
| S         | Agg       |         |         |           |       |
| M         | Agg       |         |         |           |       |

Concurrent: [N users, M queries, QPS achieved]
Recommendation: [sizing/clustering/multi-cluster advice]
```

## Error Handling

| Issue | Cause | Solution |
|-------|-------|----------|
| Queries queuing | Concurrency > capacity | Add multi-cluster or separate warehouse |
| Linear scaling fails | Query not parallelizable | Optimize SQL (reduce shuffle) |
| Spilling on larger warehouse | Data skew | Check for hot partition/join skew |
| Load test throttled | Login rate limit | Use connection pooling |
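
For the login rate limit row, the idea is to open a fixed number of connections once and let the simulated users borrow them, rather than logging in per user. The following is a minimal sketch of such a pool. The class name `ConnectionPool` and its `factory` argument are hypothetical; in a real test the factory would wrap `snowflake.connector.connect(...)`.

```python
import queue
from contextlib import contextmanager

class ConnectionPool:
    """Fixed-size pool that opens every connection up front."""

    def __init__(self, factory, size):
        self._pool = queue.Queue()
        for _ in range(size):
            # One login per pooled connection, regardless of user count
            self._pool.put(factory())

    @contextmanager
    def connection(self):
        # Blocks until a connection is free, which also caps concurrency
        conn = self._pool.get()
        try:
            yield conn
        finally:
            self._pool.put(conn)

    def close_all(self):
        """Close pooled connections; call after all workers have finished."""
        while not self._pool.empty():
            self._pool.get().close()
```

Each worker thread would then run its queries inside `with pool.connection() as conn:`. Note that a pool smaller than the number of users limits concurrency to the pool size, so size it to the concurrency you intend to test.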

## Resources

- [Warehouse Considerations](https://docs.snowflake.com/en/user-guide/warehouses-considerations)
- [Multi-Cluster Warehouses](https://docs.snowflake.com/en/user-guide/warehouses-multicluster)
- [Warehouse Load Monitoring](https://docs.snowflake.com/en/user-guide/warehouses-load-monitoring)

## Next Steps

For reliability patterns, see `snowflake-reliability-patterns`.

Related Skills

testing-load-balancers

1868
from jeremylongshore/claude-code-plugins-plus-skills

Validate load balancer behavior, failover, and traffic distribution. Use when performing specialized testing. Trigger with phrases like "test load balancer", "validate failover", or "check traffic distribution".

windsurf-load-scale

1868
from jeremylongshore/claude-code-plugins-plus-skills

Scale Windsurf adoption across large organizations with workspace strategies and performance tuning. Use when rolling out Windsurf to 50+ developers, managing large monorepo workspaces, or planning enterprise-scale deployment. Trigger with phrases like "windsurf at scale", "windsurf large team", "windsurf monorepo", "windsurf organization", "windsurf 100 developers".

vercel-load-scale

1868
from jeremylongshore/claude-code-plugins-plus-skills

Load test and scale Vercel deployments with concurrency tuning and capacity planning. Use when running performance tests, planning for traffic spikes, or optimizing serverless function scaling on Vercel. Trigger with phrases like "vercel load test", "vercel scale", "vercel performance test", "vercel capacity", "vercel benchmark".

supabase-load-scale

1868
from jeremylongshore/claude-code-plugins-plus-skills

Scale Supabase projects for production load: read replicas, connection pooling tuning via Supavisor, compute size upgrades, CDN caching for Storage, Edge Function regional deployment, and database table partitioning. Use when preparing for traffic spikes, optimizing connection limits, setting up read replicas for analytics queries, or partitioning large tables. Trigger with phrases like "supabase scale", "supabase read replica", "supabase connection pooling", "supabase compute upgrade", "supabase CDN storage", "supabase edge function regions", "supabase partitioning", "supavisor", "supabase pool mode".

snowflake-upgrade-migration

1868
from jeremylongshore/claude-code-plugins-plus-skills

Upgrade Snowflake drivers, handle breaking changes, and migrate between editions. Use when upgrading snowflake-sdk or snowflake-connector-python versions, migrating between Snowflake editions, or handling deprecations. Trigger with phrases like "upgrade snowflake", "snowflake migration", "snowflake breaking changes", "update snowflake driver", "snowflake version".

snowflake-security-basics

1868
from jeremylongshore/claude-code-plugins-plus-skills

Apply Snowflake security best practices: network policies, key rotation, MFA, encryption, and least-privilege access. Use when securing Snowflake access, implementing network policies, or auditing security configuration. Trigger with phrases like "snowflake security", "snowflake network policy", "secure snowflake", "snowflake MFA", "snowflake encryption".

snowflake-sdk-patterns

1868
from jeremylongshore/claude-code-plugins-plus-skills

Apply production-ready Snowflake SDK patterns for snowflake-sdk and snowflake-connector-python. Use when implementing connection pooling, async execute wrappers, streaming results, or establishing team coding standards for Snowflake. Trigger with phrases like "snowflake SDK patterns", "snowflake best practices", "snowflake code patterns", "idiomatic snowflake", "snowflake connection pool".

snowflake-reliability-patterns

1868
from jeremylongshore/claude-code-plugins-plus-skills

Implement Snowflake reliability patterns: replication, failover, Time Travel recovery, and application-level resilience for Snowflake integrations. Use when building fault-tolerant pipelines, configuring disaster recovery, or adding resilience to production Snowflake services. Trigger with phrases like "snowflake reliability", "snowflake failover", "snowflake replication", "snowflake disaster recovery", "snowflake Time Travel".

snowflake-reference-architecture

1868
from jeremylongshore/claude-code-plugins-plus-skills

Implement Snowflake reference architecture with medallion pattern and Snowflake-native design. Use when designing a new Snowflake data platform, setting up bronze/silver/gold layers, or establishing architecture standards for a Snowflake deployment. Trigger with phrases like "snowflake architecture", "snowflake medallion", "snowflake best practices layout", "snowflake data platform design".

snowflake-rate-limits

1868
from jeremylongshore/claude-code-plugins-plus-skills

Handle Snowflake concurrency limits, warehouse queuing, and query throttling. Use when queries are queuing, hitting concurrency limits, or needing to optimize warehouse sizing for throughput. Trigger with phrases like "snowflake rate limit", "snowflake throttling", "snowflake queuing", "snowflake concurrency", "snowflake warehouse sizing".

snowflake-prod-checklist

1868
from jeremylongshore/claude-code-plugins-plus-skills

Execute Snowflake production readiness checklist with monitoring and rollback. Use when deploying Snowflake pipelines to production, preparing for go-live, or validating production Snowflake configuration. Trigger with phrases like "snowflake production", "snowflake go-live", "snowflake launch checklist", "snowflake prod ready".

snowflake-policy-guardrails

1868
from jeremylongshore/claude-code-plugins-plus-skills

Implement Snowflake governance guardrails with network rules, session policies, authentication policies, and automated compliance checks. Use when enforcing security policies, implementing data governance, or configuring automated compliance for Snowflake. Trigger with phrases like "snowflake policy", "snowflake guardrails", "snowflake governance", "snowflake compliance", "snowflake enforce".