sql-query-optimizer
Analyzes and optimizes SQL queries across different data warehouse platforms (Snowflake, BigQuery, Redshift, Databricks) with platform-specific recommendations.
Best use case
sql-query-optimizer is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Analyzes and optimizes SQL queries across different data warehouse platforms (Snowflake, BigQuery, Redshift, Databricks) with platform-specific recommendations.
Teams using sql-query-optimizer should expect a more consistent output, faster repeated execution, less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in
.claude/skills/sql-query-optimizer/SKILL.mdinside your project - Restart your AI agent — it will auto-discover the skill
How sql-query-optimizer Compares
| Feature / Agent | sql-query-optimizer | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Analyzes and optimizes SQL queries across different data warehouse platforms (Snowflake, BigQuery, Redshift, Databricks) with platform-specific recommendations.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# SQL Query Optimizer
Analyzes and optimizes SQL queries across different data warehouse platforms with platform-specific recommendations.
## Overview
This skill examines SQL queries to identify performance bottlenecks, suggest optimizations, and provide platform-specific recommendations for Snowflake, BigQuery, Redshift, and Databricks. It analyzes query execution plans, recommends indexes/clustering keys, and identifies anti-patterns.
## Capabilities
- **Query execution plan analysis** - Parse and analyze EXPLAIN outputs
- **Index recommendations** - Suggest clustering keys, sort keys, partition keys
- **Join optimization** - Identify inefficient join patterns and suggest improvements
- **Subquery elimination** - Convert correlated subqueries to CTEs or joins
- **CTE optimization** - Materialize vs reference optimization
- **Window function optimization** - Frame and partition optimization
- **Predicate pushdown validation** - Verify filter pushdown effectiveness
- **Clustering key recommendations** - Platform-specific clustering strategies
- **Materialized view suggestions** - Identify candidates for materialized views
- **Platform-specific optimizations** - Snowflake, BigQuery, Redshift, Databricks
## Input Schema
```json
{
"query": {
"type": "string",
"description": "The SQL query to analyze",
"required": true
},
"platform": {
"type": "string",
"enum": ["snowflake", "bigquery", "redshift", "databricks", "postgres"],
"required": true,
"description": "Target data warehouse platform"
},
"tableStatistics": {
"type": "object",
"description": "Table statistics including row counts, column cardinality",
"properties": {
"tables": {
"type": "array",
"items": {
"name": "string",
"rowCount": "number",
"sizeGB": "number",
"columns": "array"
}
}
}
},
"executionPlan": {
"type": "object",
"description": "Query execution plan (EXPLAIN output)"
},
"queryHistory": {
"type": "object",
"description": "Historical query performance metrics"
},
"optimizationGoals": {
"type": "array",
"items": {
"type": "string",
"enum": ["latency", "cost", "throughput", "scan_reduction"]
},
"default": ["latency", "cost"]
}
}
```
## Output Schema
```json
{
"optimizedQuery": {
"type": "string",
"description": "The optimized SQL query"
},
"improvements": {
"type": "array",
"items": {
"type": {
"type": "string",
"enum": ["join", "predicate", "aggregation", "cte", "window", "scan", "index"]
},
"description": "string",
"impact": "high|medium|low",
"lineNumber": "number",
"originalCode": "string",
"optimizedCode": "string"
}
},
"indexRecommendations": {
"type": "array",
"items": {
"table": "string",
"type": "clustering|sort|partition|index",
"columns": "array",
"rationale": "string",
"ddl": "string"
}
},
"estimatedImprovement": {
"scanReduction": {
"type": "number",
"description": "Percentage reduction in data scanned"
},
"timeReduction": {
"type": "number",
"description": "Percentage reduction in execution time"
},
"costReduction": {
"type": "number",
"description": "Percentage reduction in query cost"
}
},
"antiPatterns": {
"type": "array",
"items": {
"pattern": "string",
"severity": "high|medium|low",
"location": "string",
"suggestion": "string"
}
},
"platformSpecificNotes": {
"type": "array",
"items": "string"
}
}
```
## Usage Examples
### Basic Query Optimization
```json
{
"query": "SELECT * FROM orders o JOIN customers c ON o.customer_id = c.id WHERE o.created_at > '2024-01-01'",
"platform": "snowflake"
}
```
### With Execution Plan Analysis
```json
{
"query": "SELECT customer_id, SUM(amount) FROM orders GROUP BY customer_id",
"platform": "bigquery",
"executionPlan": {
"stages": [...],
"totalBytesProcessed": 1073741824
},
"optimizationGoals": ["cost", "scan_reduction"]
}
```
### With Table Statistics
```json
{
"query": "SELECT ... complex query ...",
"platform": "redshift",
"tableStatistics": {
"tables": [
{
"name": "orders",
"rowCount": 10000000,
"sizeGB": 50,
"columns": [
{"name": "order_id", "cardinality": 10000000},
{"name": "customer_id", "cardinality": 500000}
]
}
]
}
}
```
## Platform-Specific Optimizations
### Snowflake
| Optimization | Description |
|--------------|-------------|
| Clustering keys | Recommend micro-partition clustering |
| Result cache | Identify queries benefiting from caching |
| Query acceleration | Suggest QUERY_ACCELERATION_MAX_SCALE_FACTOR |
| Warehouse sizing | Right-size warehouse recommendations |
### BigQuery
| Optimization | Description |
|--------------|-------------|
| Partitioning | DATE/TIMESTAMP partitioning recommendations |
| Clustering | Up to 4 clustering columns |
| BI Engine | Identify BI Engine-eligible queries |
| Slots | Estimate slot usage optimization |
### Redshift
| Optimization | Description |
|--------------|-------------|
| Sort keys | COMPOUND vs INTERLEAVED recommendations |
| Distribution | KEY, EVEN, ALL distribution strategies |
| Compression | Column encoding recommendations |
| Vacuum | VACUUM and ANALYZE recommendations |
### Databricks
| Optimization | Description |
|--------------|-------------|
| Z-ordering | Multi-column Z-order recommendations |
| Delta cache | Caching strategy recommendations |
| Photon | Photon-eligible query patterns |
| Adaptive execution | AQE configuration suggestions |
## Common Anti-Patterns Detected
### Query Structure
| Anti-Pattern | Impact | Fix |
|--------------|--------|-----|
| SELECT * | High | Specify columns explicitly |
| Correlated subqueries | High | Convert to JOIN or CTE |
| DISTINCT on large datasets | Medium | Use GROUP BY or window functions |
| Non-SARGable predicates | High | Rewrite for index usage |
### Join Issues
| Anti-Pattern | Impact | Fix |
|--------------|--------|-----|
| Cartesian products | Critical | Add join conditions |
| Implicit joins | Medium | Use explicit JOIN syntax |
| Wrong join order | High | Reorder by selectivity |
| Missing indexes on join keys | High | Add clustering/sort keys |
### Aggregation Issues
| Anti-Pattern | Impact | Fix |
|--------------|--------|-----|
| GROUP BY ordinal | Low | Use column names |
| Aggregating before filter | High | Filter first, then aggregate |
| Over-grouping | Medium | Reduce GROUP BY columns |
## Integration Points
### MCP Server Integration
- **Snowflake MCP** - Real-time execution plan analysis
- **BigQuery MCP** - Cost estimation and slot analysis
- **Redshift MCP** - Query execution and statistics
### Related Skills
- Data Quality Profiler (SK-DEA-005) - Table statistics gathering
- dbt Project Analyzer (SK-DEA-003) - Model query optimization
### Applicable Processes
- Query Optimization (`query-optimization.js`)
- Data Warehouse Setup (`data-warehouse-setup.js`)
- BI Dashboard Development (`bi-dashboard.js`)
- OBT Creation (`obt-creation.js`)
## References
- [Snowflake Query Performance](https://docs.snowflake.com/en/user-guide/performance-query)
- [BigQuery Best Practices](https://cloud.google.com/bigquery/docs/best-practices-performance-overview)
- [Redshift Performance Tuning](https://docs.aws.amazon.com/redshift/latest/dg/c_designing-queries-best-practices.html)
- [Databricks SQL Best Practices](https://docs.databricks.com/sql/language-manual/sql-ref-syntax-qry-select.html)
## Version History
- **1.0.0** - Initial release with multi-platform supportRelated Skills
react-query
TanStack Query (React Query) patterns for server state management, caching, mutations, optimistic updates, and infinite queries.
svg-optimizer
Optimize SVG assets, generate sprites, and convert to React components
db-query-analyzer
Analyze database query performance with execution plans and index recommendations
clinical-documentation-query
Generate compliant physician queries to clarify clinical documentation for accurate coding, severity of illness, and risk of mortality capture
doe-optimizer
Skill for optimizing experimental designs using DOE principles
circuit-optimizer
Quantum circuit optimization skill for gate reduction, depth minimization, and hardware-aware compilation
nanoparticle-synthesis-optimizer
Synthesis parameter optimization skill for metal, semiconductor, and oxide nanoparticle production with automated protocol generation and reproducibility validation
drug-encapsulation-optimizer
Drug delivery formulation skill for optimizing drug loading, encapsulation efficiency, and release kinetics
warehouse-slotting-optimizer
Warehouse slotting and layout optimization skill for pick path minimization and space utilization.
network-optimizer
Network optimization skill for transportation, assignment, and flow problems on graph structures.
facility-layout-optimizer
Facility layout optimization skill for material flow minimization and space utilization.
signal-timing-optimizer
Traffic signal timing optimization skill for cycle length, phasing, and coordination