nw-query-optimization

SQL and NoSQL query optimization techniques, indexing strategies, execution plan analysis, JOIN algorithms, cardinality estimation, and database-specific query patterns

Best use case

nw-query-optimization is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using nw-query-optimization should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/nw-query-optimization/SKILL.md --create-dirs "https://raw.githubusercontent.com/nWave-ai/nWave/main/nWave/skills/nw-query-optimization/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/nw-query-optimization/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How nw-query-optimization Compares

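| Feature / Agent | nw-query-optimization | Standard Approach |
|-----------------|-----------------------|-------------------|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |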

Frequently Asked Questions

What does this skill do?

It provides guidance on SQL and NoSQL query optimization: indexing strategies, execution plan analysis, JOIN algorithms, cardinality estimation, and database-specific query patterns.

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# Query Optimization

## Cost-Based Optimization

Modern relational DBs use cost-based optimizers (CBO): generate plan candidates -> estimate each plan's cost from statistics (row counts, value distributions, selectivity) -> select the plan with the lowest estimated I/O, CPU, and memory cost. Stale statistics skew these estimates and lead to suboptimal plans.
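
As a rough illustration of the cost comparison described above, the sketch below scores two candidate plans for a single predicate. The cost constants and the function names (`seq_scan_cost`, `index_scan_cost`, `choose_plan`) are hypothetical and do not reproduce any real optimizer's formulas.

```python
# Toy cost model: every constant here is an illustrative assumption.
SEQ_PAGE_COST = 1.0      # cost of reading one page sequentially
RANDOM_PAGE_COST = 4.0   # cost of one random page fetch through an index
ROWS_PER_PAGE = 100


def seq_scan_cost(table_rows: int) -> float:
    """Read every page of the table once."""
    return (table_rows / ROWS_PER_PAGE) * SEQ_PAGE_COST


def index_scan_cost(table_rows: int, selectivity: float) -> float:
    """Assume one random page fetch per matching row (no clustering)."""
    return table_rows * selectivity * RANDOM_PAGE_COST


def choose_plan(table_rows: int, selectivity: float) -> str:
    """Pick the cheaper of the two candidate plans."""
    seq = seq_scan_cost(table_rows)
    idx = index_scan_cost(table_rows, selectivity)
    return "index scan" if idx < seq else "seq scan"


# Selective predicate: 0.01% of 1,000,000 rows match.
print(choose_plan(1_000_000, 0.0001))  # index scan
# Same predicate after the data changed: 30% of rows match.
print(choose_plan(1_000_000, 0.30))    # seq scan
```

If the statistics still report 0.01% selectivity after the data has shifted to 30%, a model like this keeps choosing the index scan even though it is now far more expensive, which is the failure mode stale statistics cause.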

### Execution Plan Analysis

Validate optimization with EXPLAIN before and after changes.

```sql
-- PostgreSQL: ANALYZE executes the query and reports actual row counts and timings
EXPLAIN ANALYZE SELECT order_id, total FROM orders WHERE customer_id = 12345;
-- MySQL: EXPLAIN FORMAT=JSON ...
-- SQL Server: SET STATISTICS IO ON (I/O counts); SET SHOWPLAN_XML ON (estimated plan)
```

Key indicators: **Seq Scan/Table Scan** = possible missing index (can still be the right choice for small tables or low-selectivity predicates) | **Index Scan/Seek** = usually efficient | **Hash Join** = large equality joins | **Nested Loop** = small/indexed inner | **Merge Join** = pre-sorted inputs | **Sort** = watch for disk spills
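
The before/after check can be sketched with Python's built-in `sqlite3` module, used here only as a convenient stand-in engine. SQLite's `EXPLAIN QUERY PLAN` output differs from PostgreSQL's `EXPLAIN ANALYZE`, and the `plan` helper and index name are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 1000, float(i)) for i in range(10_000)],
)

QUERY = "SELECT order_id, total FROM orders WHERE customer_id = ?"


def plan(sql: str, params: tuple) -> str:
    """Return the 'detail' column of SQLite's plan rows as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)


plan_before = plan(QUERY, (12345,))
conn.execute("CREATE INDEX idx_orders_customer ON orders(customer_id)")
plan_after = plan(QUERY, (12345,))

print("before:", plan_before)  # a full scan of orders
print("after: ", plan_after)   # a search that uses idx_orders_customer
```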

## Indexing Strategies

### B-Tree (Default)
Supports: equality, range, sorting, prefix matching | O(log n) lookup | General-purpose, all major DBs default

### Hash
Equality only | O(1) average lookup | High-cardinality exact-match | No range/sorting/pattern support
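
The difference between the two index types can be sketched in plain Python: a sorted list searched with `bisect` stands in for a B-tree, and a `dict` stands in for a hash index. This is a simplified model of the access patterns, not how any database implements its indexes; the data and function names are made up for illustration.

```python
from bisect import bisect_left, bisect_right

ages = [34, 18, 52, 27, 41, 27, 63]  # row_id is the list position

# Ordered index (stand-in for a B-tree): sorted (key, row_id) pairs.
ordered_index = sorted((age, row_id) for row_id, age in enumerate(ages))

# Hash index: key -> row ids, with no ordering between keys.
hash_index = {}
for row_id, age in enumerate(ages):
    hash_index.setdefault(age, []).append(row_id)


def range_lookup(low: int, high: int) -> list:
    """Row ids with low <= age <= high, located by two binary searches."""
    start = bisect_left(ordered_index, (low, -1))
    end = bisect_right(ordered_index, (high, len(ages)))
    return [row_id for _, row_id in ordered_index[start:end]]


def equality_lookup(age: int) -> list:
    """Single hash probe; this structure cannot answer range predicates."""
    return hash_index.get(age, [])


print(equality_lookup(27))   # [3, 5]
print(range_lookup(25, 45))  # [3, 5, 0, 4]
```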

### Covering Indexes
Include all query columns in index -> eliminates table access (index-only scan) | Trade-off: larger index, slower writes

```sql
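-- Covering index for: SELECT name, email FROM users WHERE status = 'active'
-- INCLUDE is available in PostgreSQL 11+ and SQL Server; in MySQL, add the columns to the index key instead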
CREATE INDEX idx_users_status_covering ON users(status) INCLUDE (name, email);
```
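
To check that an index really covers a query, compare plans for a covered and a non-covered column list. The sketch below uses Python's built-in `sqlite3` as a stand-in; SQLite has no `INCLUDE` clause, so the extra columns are appended as key columns. The `bio` column and the `plan` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, status TEXT, name TEXT, email TEXT, bio TEXT)"
)
# No INCLUDE in SQLite: name and email become trailing key columns instead.
conn.execute("CREATE INDEX idx_users_status_covering ON users(status, name, email)")


def plan(sql: str) -> str:
    """Return the 'detail' column of SQLite's plan rows as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)


# Every referenced column lives in the index, so the table is never touched.
covered = plan("SELECT name, email FROM users WHERE status = 'active'")
# bio is not in the index, so each match needs an extra table lookup.
not_covered = plan("SELECT name, email, bio FROM users WHERE status = 'active'")

print(covered)
print(not_covered)
```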

### PostgreSQL Specialized
- **GiST**: Geometric data, full-text search, nearest-neighbor
- **GIN**: Arrays, full-text search, JSONB queries
- **BRIN**: Large tables with physically correlated data (timestamps), minimal storage
- **SP-GiST**: Non-balanced structures, point-based geometric queries

### Compound Index Design
Column order: 1. Equality conditions first (highest selectivity) | 2. Sort columns second | 3. Range conditions last

### MongoDB ESR Rule
Equality-Sort-Range ordering for compound indexes:
```javascript
// Query: status = "A", qty > 20, sorted by item
// Optimal index:
db.collection.createIndex({ status: 1, item: 1, qty: 1 })
//                          E(quality)  S(ort)   R(ange)
```
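
The effect of column order can be observed on any B-tree engine. The sketch below uses Python's built-in `sqlite3` rather than MongoDB, on the assumption that the same ordering principle carries over; the `inventory` table and `plan_with_index` helper are invented for this example.

```python
import sqlite3

QUERY = "SELECT item, qty FROM inventory WHERE status = 'A' AND qty > 20 ORDER BY item"


def plan_with_index(index_columns: str) -> str:
    """Build a fresh table with one compound index and return the query plan."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE inventory (status TEXT, item TEXT, qty INTEGER)")
    conn.execute(f"CREATE INDEX idx_inventory ON inventory({index_columns})")
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY).fetchall()
    conn.close()
    return " | ".join(row[3] for row in rows)


esr_plan = plan_with_index("status, item, qty")  # Equality, Sort, Range
ers_plan = plan_with_index("status, qty, item")  # Equality, Range, Sort

print("ESR:", esr_plan)  # rows come out of the index already ordered by item
print("ERS:", ers_plan)  # needs an extra sort step (temp B-tree) for ORDER BY
```

With the range column ahead of the sort column, the index returns rows ordered by `qty`, so the engine has to sort the result separately.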

## SQL Optimization Patterns

### Select Only Needed Columns
```sql
-- Bad: SELECT * retrieves unnecessary data, prevents covering indexes
SELECT * FROM orders WHERE customer_id = 12345;

-- Good: Specify columns, enables covering index
SELECT order_id, order_date, total FROM orders WHERE customer_id = 12345;
```

### Other Key Patterns
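- **CTEs**: Improve readability but not always performance -- PostgreSQL always materialized CTEs before v12 (v12+ inlines side-effect-free, singly referenced CTEs unless marked `MATERIALIZED`); MySQL 8.0 may merge or materialize them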
- **Window functions**: Use `SUM() OVER`, `RANK() OVER (PARTITION BY ...)` for analytics without self-joins
- **Pagination**: Prefer keyset (`WHERE id > last_seen ORDER BY id LIMIT N`) over OFFSET for deep pages
- **Parameterized queries**: Prevent SQL injection AND enable plan caching (`cursor.execute("... WHERE id = %s", (id,))`)
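
Keyset pagination and parameterized queries combine naturally. The following is a minimal sketch with Python's built-in `sqlite3` (which uses `?` placeholders rather than `%s`); the `events` table and the helper names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany(
    "INSERT INTO events (id, payload) VALUES (?, ?)",
    [(i, f"event-{i}") for i in range(1, 26)],
)


def fetch_page(last_seen_id: int, page_size: int) -> list:
    """Keyset page: seek past the last id seen instead of skipping OFFSET rows."""
    return conn.execute(
        "SELECT id, payload FROM events WHERE id > ? ORDER BY id LIMIT ?",
        (last_seen_id, page_size),  # bound parameters, never string formatting
    ).fetchall()


def iterate_all(page_size: int = 10):
    """Yield pages until the table is exhausted."""
    last_seen_id = 0
    while True:
        page = fetch_page(last_seen_id, page_size)
        if not page:
            return
        yield page
        last_seen_id = page[-1][0]  # resume after the last row of this page


pages = list(iterate_all())
print([len(p) for p in pages])  # [10, 10, 5]
```

Each page costs one index seek plus `page_size` rows, regardless of depth, whereas `OFFSET` has to walk past every skipped row.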

## JOIN Algorithm Selection

| Algorithm | Best When | Cost |
|-----------|-----------|------|
| Nested Loop | Small outer table, indexed inner table | O(n * m) worst, O(n * log m) with index |
| Hash Join | Large tables, equality joins, no useful indexes | O(n + m) build + probe |
| Merge Join | Both inputs already sorted (index order) | O(n + m) when pre-sorted; add O(n log n + m log m) if a sort is needed |
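
The three algorithms in the table can be sketched as small in-memory functions. These are simplified teaching versions over made-up tuples; the merge join assumes the join key is unique on the `customers` side, and real engines add batching, spilling, and duplicate handling.

```python
from collections import defaultdict

customers = [(1, "Ada"), (2, "Grace"), (3, "Edsger")]   # (customer_id, name)
orders = [(10, 1), (11, 3), (12, 1), (13, 2), (14, 9)]  # (order_id, customer_id)


def nested_loop_join(outer, inner):
    """Compare every outer row with every inner row: O(n * m)."""
    return [(o[0], c[1]) for o in outer for c in inner if o[1] == c[0]]


def hash_join(build, probe):
    """Build a hash table on one input, then probe it with the other: O(n + m)."""
    table = defaultdict(list)
    for c in build:
        table[c[0]].append(c)
    return [(o[0], c[1]) for o in probe for c in table.get(o[1], [])]


def merge_join(left, right):
    """Advance two cursors over inputs sorted on the join key."""
    left = sorted(left, key=lambda o: o[1])    # orders by customer_id
    right = sorted(right, key=lambda c: c[0])  # customers by customer_id
    result, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i][1] < right[j][0]:
            i += 1
        elif left[i][1] > right[j][0]:
            j += 1
        else:
            # Key is unique on the right side: emit and advance the left cursor only.
            result.append((left[i][0], right[j][1]))
            i += 1
    return result


print(nested_loop_join(orders, customers))
print(hash_join(customers, orders))
print(merge_join(orders, customers))
```

All three return the same (order_id, name) pairs; only the row order and the amount of work differ.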

## Cardinality Estimation

Optimizer predicts row counts using: **Histograms** (value distribution) | **Density vectors** (non-histogram columns) | **Statistics objects** via ANALYZE (PostgreSQL) / UPDATE STATISTICS (SQL Server)

When estimation is wrong (correlated columns, skewed data, multi-table joins): 1. Run ANALYZE/UPDATE STATISTICS | 2. Create multi-column statistics (e.g. PostgreSQL `CREATE STATISTICS`) | 3. Query hints as last resort
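
The correlated-columns case can be worked through with simple arithmetic. The sketch below uses synthetic data in which `zip_code` fully determines `city`, and applies the independence assumption (multiplying per-column selectivities) that optimizers commonly fall back to without multi-column statistics. The data and variable names are illustrative only.

```python
# Synthetic rows: 10% are ("Paris", "75001"), the rest ("Lyon", "69001").
rows = [("Paris", "75001") if i % 10 == 0 else ("Lyon", "69001") for i in range(10_000)]
total = len(rows)

# Per-column selectivities, as single-column statistics would report them.
sel_city = sum(1 for city, _ in rows if city == "Paris") / total
sel_zip = sum(1 for _, zip_code in rows if zip_code == "75001") / total

# Independence assumption: multiply the selectivities.
estimated_rows = total * sel_city * sel_zip

# Ground truth for: WHERE city = 'Paris' AND zip_code = '75001'
actual_rows = sum(1 for city, zip_code in rows if city == "Paris" and zip_code == "75001")

print(round(estimated_rows), actual_rows)  # 100 1000
```

The estimate is off by a factor of ten, which is enough to push an optimizer toward a nested loop where a hash join would be cheaper.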

## NoSQL Query Optimization

### MongoDB
Place `$match`/`$project` early in pipelines | Use `$lookup` sparingly (left outer joins) | Compound indexes following ESR | Validate with `explain("executionStats")`

### Cassandra
Always include partition key | Design tables around query patterns (query-first) | Use SAI over SASI (43% throughput gain) | Avoid ALLOW FILTERING (full cluster scan) | Materialized views add write overhead

### DynamoDB
Use Query not Scan | Design partition keys for even distribution | GSIs for alternative access patterns | Single-table design with composite sort keys
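
Why partition key cardinality matters can be sketched with a small simulation. The partition count, the SHA-256 based `partition_for` function, and the key values are all assumptions for illustration; DynamoDB's actual partitioner is internal and differs.

```python
import hashlib
from collections import Counter

NUM_PARTITIONS = 8  # hypothetical number of partitions


def partition_for(key: str) -> int:
    """Map a key to a partition with a stable hash (stand-in for the store's partitioner)."""
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % NUM_PARTITIONS


def load_per_partition(keys) -> Counter:
    """Count how many items land on each partition."""
    return Counter(partition_for(k) for k in keys)


statuses = ["active", "inactive", "pending"]
# Low-cardinality key: three distinct values, so at most three partitions take traffic.
by_status = load_per_partition(statuses[i % 3] for i in range(9_000))
# High-cardinality key: one value per user spreads items across partitions.
by_user = load_per_partition(f"user-{i}" for i in range(9_000))

print("status key:", len(by_status), "partitions used, max load", max(by_status.values()))
print("user key:  ", len(by_user), "partitions used, max load", max(by_user.values()))
```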

### Redis
FT.SEARCH for complex queries (RediSearch module) | Design key naming for efficient SCAN | Use pipelining for batch ops

## Anti-Patterns to Detect

- **`SELECT *`**: Wastes I/O, prevents covering indexes
- **Missing indexes** on WHERE/JOIN/ORDER BY columns: full table scans
- **N+1 queries**: Fetch in loops instead of JOINs/batch
- **Implicit type conversions**: Can prevent index use (`WHERE varchar_col = 123`); compare against a value of the column's own type
- **Functions on indexed columns**: `WHERE UPPER(name) = 'JOHN'` blocks a plain index on `name`; use a function-based (expression) index
- **Missing pagination**: Unbounded result sets
- **Hot partitions** (NoSQL): Low-cardinality partition keys concentrate load
- **ALLOW FILTERING** (Cassandra): Expensive full-cluster scans
- **Large partitions** (Cassandra): >100MB degrades performance
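
The "functions on indexed columns" anti-pattern is easy to detect from a plan. The sketch below uses Python's built-in `sqlite3`, which supports indexes on deterministic expressions; the table, index names, and `plan` helper are hypothetical, and other engines use different syntax and plan wording.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE INDEX idx_users_name ON users(name)")

QUERY = "SELECT id FROM users WHERE UPPER(name) = 'JOHN'"


def plan(sql: str) -> str:
    """Return the 'detail' column of SQLite's plan rows as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)


# The plain index on name cannot be searched by UPPER(name): every entry is visited.
plan_plain_index = plan(QUERY)

# An expression index stores UPPER(name), so the predicate becomes a direct lookup.
conn.execute("CREATE INDEX idx_users_upper_name ON users(UPPER(name))")
plan_expression_index = plan(QUERY)

print("plain index:     ", plan_plain_index)
print("expression index:", plan_expression_index)
```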
