neon-postgres-egress-optimizer

Diagnose and fix excessive Postgres egress (network data transfer) in a codebase. Use when a user mentions high database bills, unexpected data transfer costs, network transfer charges, egress spikes, "why is my Neon bill so high", "database costs jumped", SELECT * optimization, query overfetching, reduce Neon costs, optimize database usage, or wants to reduce data sent from their database to their application. Also use when reviewing query patterns for cost efficiency, even if the user doesn't explicitly mention egress or data transfer.

3,891 stars

byopenclaw

View on GitHub Installation ↓

Best use case

neon-postgres-egress-optimizer is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using neon-postgres-egress-optimizer should expect a more consistent output, faster repeated execution, less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$curl -o ~/.claude/skills/neon-postgres-egress-optimizer/SKILL.md --create-dirs "https://raw.githubusercontent.com/openclaw/skills/main/skills/andrelandgraf/neon-postgres-egress-optimizer/SKILL.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/neon-postgres-egress-optimizer/SKILL.md inside your project
Restart your AI agent — it will auto-discover the skill

How neon-postgres-egress-optimizer Compares

Feature / Agent	neon-postgres-egress-optimizer	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

What does this skill do?

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

Related Guides

AI Agents for Coding

Browse AI agent skills for coding, debugging, testing, refactoring, code review, and developer workflows across Claude, Cursor, and Codex.

Best AI Skills for Claude

Explore the best AI skills for Claude and Claude Code across coding, research, workflow automation, documentation, and agent operations.

Cursor vs Codex for AI Workflows

Compare Cursor and Codex for AI coding workflows, repository assistance, debugging, refactoring, and reusable developer skills.

SKILL.md Source

# Postgres Egress Optimizer

Guide the user through diagnosing and fixing application-side query patterns that cause excessive data transfer (egress) from their Postgres database. Most high egress bills come from the application fetching more data than it uses.

## Step 1: Diagnose

Identify which queries transfer the most data. The primary tool is the `pg_stat_statements` extension.

### Check if pg_stat_statements is available

```sql
SELECT 1 FROM pg_stat_statements LIMIT 1;
```

If this errors, the extension needs to be created:

```sql
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;
```

On Neon, it is available by default but may need this CREATE EXTENSION step.

### Handle empty stats

Stats are cleared when a Neon compute scales to zero and restarts. If the stats are empty or the compute recently woke up:

1. Reset the stats to start a clean measurement window: `SELECT pg_stat_statements_reset();`
2. Let the application run under representative traffic for at least an hour.
3. Return and run the diagnostic queries below.

If the user has stats from a production database, use those. If they have no access to production stats, proceed to Step 2 and analyze the codebase directly — code-level patterns are often sufficient to identify the worst offenders.

### Diagnostic queries

Run these to identify the top egress contributors. Focus on queries that return many rows, return wide rows (JSONB, TEXT, BYTEA columns), or are called very frequently.

**Queries returning the most total rows:**

```sql
SELECT query, calls, rows AS total_rows, rows / calls AS avg_rows_per_call
FROM pg_stat_statements
WHERE calls > 0
ORDER BY rows DESC
LIMIT 10;
```

**Queries returning the most rows per execution** (poorly scoped SELECTs, missing pagination):

```sql
SELECT query, calls, rows AS total_rows, rows / calls AS avg_rows_per_call
FROM pg_stat_statements
WHERE calls > 0
ORDER BY avg_rows_per_call DESC
LIMIT 10;
```

**Most frequently called queries** (candidates for caching):

```sql
SELECT query, calls, rows AS total_rows, rows / calls AS avg_rows_per_call
FROM pg_stat_statements
WHERE calls > 0
ORDER BY calls DESC
LIMIT 10;
```

**Longest running queries** (not a direct egress measure, but helps identify problem queries during a spike):

```sql
SELECT query, calls, rows AS total_rows,
  round(total_exec_time::numeric, 2) AS total_exec_time_ms
FROM pg_stat_statements
WHERE calls > 0
ORDER BY total_exec_time DESC
LIMIT 10;
```

### Interpret the results

Rank findings by estimated egress impact:

- **High row count + wide rows** = biggest egress. A query returning 1,000 rows where each row includes a 50KB JSONB column transfers ~50MB per call.
- **Extreme call frequency** on even small queries adds up. A query called 50,000 times/day returning 10 rows each = 500,000 rows/day.
- **Cross-reference with the schema** to identify which columns are wide. Look for JSONB, TEXT, BYTEA, and large VARCHAR columns.

## Step 2: Analyze codebase

For each query identified in Step 1, or for each database query in the codebase if no stats are available, check:

- Does it select only the columns the response needs?
- Does it return a bounded number of rows (LIMIT/pagination)?
- Is it called frequently enough to benefit from caching?
- Does it fetch raw data that gets aggregated in application code?
- Does it use a JOIN that duplicates parent data across child rows?

## Step 3: Fix

Apply the appropriate fix for each problem found. Below are the most common egress anti-patterns and how to fix them.

### Unused columns (SELECT \*)

**Problem:** The query fetches all columns but the application only uses a few. Large columns (JSONB blobs, TEXT fields) get transferred over the wire and discarded.

**Before:**

```sql
SELECT * FROM products;
```

**After:**

```sql
SELECT id, name, price, image_urls FROM products;
```

### Missing pagination

**Problem:** A list endpoint returns all rows with no LIMIT. This is an unbounded egress risk — every new row in the table increases data transfer on every request. Flag this regardless of current table size.

This is easy to miss because the application may work fine with small datasets. But at scale, an unpaginated endpoint returning 10,000 rows with even moderate column widths can transfer hundreds of megabytes per day.

**Before:**

```sql
SELECT id, name, price FROM products;
```

**After:**

```sql
SELECT id, name, price FROM products
ORDER BY id
LIMIT 50 OFFSET 0;
```

When adding pagination, check whether the consuming client already supports paginated responses. If not, pick sensible defaults and document the pagination parameters in the API.

### High-frequency queries on static data

**Problem:** A query is called thousands of times per day but returns data that rarely changes. Every call transfers the same rows from the database. This pattern is only visible from `pg_stat_statements` — the code itself looks normal.

Look for queries with extremely high call counts relative to other queries. Common examples: configuration tables, category lists, feature flags, user role definitions.

**Fix:** Add a caching layer between the application and the database so it avoids hitting the database on every request.

### Application-side aggregation

**Problem:** The application fetches all rows from a table and then computes aggregates (averages, counts, sums, groupings) in application code. The full dataset transfers over the wire even though the result is a small summary.

**Fix:** Push the aggregation into SQL.

**Before:** The application fetches entire tables and aggregates in code with loops or `.reduce()`.

**After:**

```sql
SELECT p.category_id,
       AVG(r.rating) AS avg_rating,
       COUNT(r.id) AS review_count
FROM reviews r
INNER JOIN products p ON r.product_id = p.id
GROUP BY p.category_id;
```

### JOIN duplication

**Problem:** A JOIN between a wide parent table and a child table duplicates all parent columns across every child row. If a product has 200 reviews and the product row includes a 50KB JSONB column, the join sends that 50KB × 200 = ~10MB for a single request.

This is distinct from the SELECT \* problem. Even if you select only needed columns, a JOIN still repeats the parent data for every child row. The fix is structural: avoid the join entirely.

**Before:**

```sql
SELECT * FROM products
LEFT JOIN reviews ON reviews.product_id = products.id
WHERE products.id = 1;
```

**After (two separate queries):**

```sql
SELECT id, name, price, description, image_urls FROM products WHERE id = 1;
SELECT id, user_name, rating, body FROM reviews WHERE product_id = 1;
```

Two queries instead of one JOIN. The product data is fetched once. The reviews are fetched once. No duplication.

## Step 4: Verify

After applying fixes:

1. **Run existing tests** to confirm nothing broke.
2. **Check the responses** — make sure the API still returns the same data shape. Column selection and pagination changes can break clients that depend on specific fields or full result sets.
3. **Measure the improvement** — if pg_stat_statements data is available, reset it (`SELECT pg_stat_statements_reset();`), let traffic run, then re-run the diagnostic queries to compare before and after.

## Further reading

- https://neon.com/docs/introduction/network-transfer.md
- https://neon.com/docs/introduction/cost-optimization.md

Related Skills

Pricing Optimizer

3891

from openclaw/skills

Analyzes and optimizes pricing strategy using proven frameworks

Business Strategy & Growth

Logistics Operations Optimizer

3891

from openclaw/skills

You are a logistics operations analyst. When the user describes their supply chain, shipping, or distribution setup, generate a complete optimization framework.

Logistics & Supply Chain Optimization

Fleet Management Optimizer

3891

from openclaw/skills

You are a fleet management analyst. Help the user optimize vehicle fleet operations, reduce costs, and improve utilization.

Workflow & Productivity

Customer Acquisition Cost (CAC) Optimizer

3891

from openclaw/skills

Analyze, benchmark, and reduce your customer acquisition cost across every channel.

Business Analytics & Strategy

calendar-optimizer

3891

from openclaw/skills

Analyzes and rewrites calendar events into clear, actionable tasks. Removes meeting fluff and converts vague descriptions into specific deliverables with deadlines.

regression-story-builder

3891

from openclaw/skills

基于历史问题生成回归测试故事集、风险等级和优先级。；use for regression, testing, qa workflows；do not use for 宣称已经执行测试, 跳过高风险路径.

postgres-code-review

3891

from openclaw/skills

Reviews PostgreSQL code for indexing strategies, JSONB operations, connection pooling, and transaction safety. Use when reviewing SQL queries, database schemas, JSONB usage, or connection management.

geo-seo-optimizer

3891

from openclaw/skills

Optimize content for Generative AI search engines (Perplexity, ChatGPT, Gemini). Use when drafting articles, marketing copy, or technical docs to ensure AI models prioritize your information as a top reference.

hinge-profile-optimizer

3891

from openclaw/skills

Comprehensive, research-backed Hinge dating profile optimization. Use when someone wants to improve their Hinge profile, audit an existing profile, write better prompts/captions, select and order photos strategically, or understand why they're not getting quality matches. This is the thorough process (~45 mins) - discovery interview, honest market math, photo strategy, copy creation, settings cleanup, and implementation support. Grounded in peer-reviewed behavioral research, platform data, and signaling theory.

Amazon Listing Optimizer — Free Listing Analysis & Keyword Research

3891

from openclaw/skills

**Free alternative to Helium 10 ($97/mo) and Jungle Scout ($49/mo).**

SKILL: Read Optimizer (read-optimizer)

3891

from openclaw/skills

## Description

openclaw-egress

3891

from openclaw/skills

No description provided.