caching

Caching strategies, invalidation, eviction policies, HTTP caching, distributed caching, and anti-patterns. Use when designing cache layers, choosing eviction policies, debugging stale data, or optimizing read-heavy workloads.

by wpank

Best use case

The caching skill is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using the caching skill should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/caching/SKILL.md --create-dirs "https://raw.githubusercontent.com/wpank/ai/main/skills/api/caching/SKILL.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/caching/SKILL.md inside your project
Restart your AI agent — it will auto-discover the skill

SKILL.md Source

# Caching Patterns

> A well-placed cache is the cheapest way to buy speed. A misplaced cache is the most expensive way to buy bugs.

## Cache Strategies

| Strategy | How It Works | When to Use |
|----------|-------------|-------------|
| **Cache-Aside (Lazy)** | App checks cache → miss → reads DB → writes to cache | **Default choice** — general purpose |
| **Read-Through** | Cache fetches from DB on miss automatically | ORM-integrated caching, CDN origin fetch |
| **Write-Through** | Writes go to cache AND DB synchronously | Read-heavy with strong consistency |
| **Write-Behind** | Writes go to cache, async flush to DB | High write throughput, eventual consistency OK |
| **Refresh-Ahead** | Cache proactively refreshes before expiry | Predictable access patterns, low-latency critical |

```
Cache-Aside Flow:

  App ──► Cache ──► HIT? ──► Return data
              │
              ▼ MISS
          Read DB ──► Store in Cache ──► Return data
```
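
The flow above can be sketched in a few lines of Python. This is a minimal illustration, not a production client: the in-process dict stands in for Redis or Memcached, and `CacheAside`, `load_from_db`, and `db_reads` are hypothetical names introduced only for this example.

```python
from typing import Callable, Dict, Optional


class CacheAside:
    """Cache-aside: the application owns both the cache lookup and the DB read."""

    def __init__(self, load_from_db: Callable[[str], Optional[str]]) -> None:
        self._cache: Dict[str, str] = {}    # stand-in for Redis/Memcached
        self._load_from_db = load_from_db   # hypothetical source-of-truth reader
        self.db_reads = 0                   # counts misses that reached the DB

    def get(self, key: str) -> Optional[str]:
        if key in self._cache:              # HIT: return cached data
            return self._cache[key]
        self.db_reads += 1                  # MISS: read the database
        value = self._load_from_db(key)
        if value is not None:
            self._cache[key] = value        # store in cache for the next reader
        return value

    def invalidate(self, key: str) -> None:
        """Drop a key, for example after the underlying row is updated."""
        self._cache.pop(key, None)


_db = {"user:1": "Ada"}                     # toy "database"
cache = CacheAside(_db.get)
first = cache.get("user:1")                 # miss, reads the DB
second = cache.get("user:1")                # hit, served from the cache
```

Only the first call reaches the database; calling `invalidate` forces the next read back to the source of truth.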


## Installation

### OpenClaw / Moltbot / Clawbot

```bash
npx clawhub@latest install caching
```


---

## Cache Invalidation

| Method | Consistency | When to Use |
|--------|-------------|-------------|
| **TTL-based** | Eventual (up to TTL) | Simple data, acceptable staleness |
| **Event-based** | Near real-time (bounded by event delivery lag) | Inventory, profile updates |
| **Version-based** | Strong | Static assets, API responses, config |
| **Tag-based** | Strong | CMS content, category-based purging |
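
As a rough sketch of the tag-based row above, the following keeps a reverse index from tag to keys so that one purge call removes every entry carrying that tag. `TaggedCache` and `purge_tag` are hypothetical names for illustration; CDNs expose the same idea through their own purge APIs, which are not modelled here.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, Set


class TaggedCache:
    """In-memory cache with a tag -> keys index for group invalidation."""

    def __init__(self) -> None:
        self._data: Dict[str, Any] = {}
        self._keys_by_tag: Dict[str, Set[str]] = defaultdict(set)

    def set(self, key: str, value: Any, tags: Iterable[str] = ()) -> None:
        self._data[key] = value
        for tag in tags:
            self._keys_by_tag[tag].add(key)

    def get(self, key: str) -> Any:
        return self._data.get(key)

    def purge_tag(self, tag: str) -> int:
        """Remove every entry carrying `tag`; return how many were removed."""
        removed = 0
        for key in self._keys_by_tag.pop(tag, set()):
            if key in self._data:
                del self._data[key]
                removed += 1
        return removed


cache = TaggedCache()
cache.set("page:/shoes", "<html>shoes</html>", tags=["category:shoes"])
cache.set("page:/shoes/42", "<html>42</html>", tags=["category:shoes", "product:42"])
cache.set("page:/hats", "<html>hats</html>", tags=["category:hats"])
purged = cache.purge_tag("category:shoes")  # removes both shoe pages
```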

### TTL Guidelines

| Data Type | TTL | Rationale |
|-----------|-----|-----------|
| Static assets (CSS/JS/images) | 1 year + cache-busting hash | Immutable by filename |
| API config / feature flags | 30–60 seconds | Fast propagation needed |
| User profile data | 5–15 minutes | Tolerable staleness |
| Product catalog | 1–5 minutes | Balance freshness vs load |
| Session data | Match session timeout | Security requirement |
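
To make the TTL guidelines concrete, here is a small sketch of a TTL cache that expires entries lazily on read. The class name, the constant `FEATURE_FLAG_TTL_SECONDS`, and the injectable `clock` are assumptions made for the example; the 60-second value is taken from the feature-flag row above.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple

FEATURE_FLAG_TTL_SECONDS = 60  # upper end of the 30-60 second guideline


class TTLCache:
    """Each entry stores its own expiry time and is dropped lazily on read."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: Dict[str, Tuple[float, Any]] = {}  # key -> (expires_at, value)

    def set(self, key: str, value: Any, ttl_seconds: float) -> None:
        self._entries[key] = (self._clock() + ttl_seconds, value)

    def get(self, key: str) -> Optional[Any]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # expired: evict and report a miss
            return None
        return value


now = [0.0]                                  # fake clock for the demo
cache = TTLCache(clock=lambda: now[0])
cache.set("flags", {"beta": True}, FEATURE_FLAG_TTL_SECONDS)
now[0] = 59.0
fresh = cache.get("flags")                   # still within the TTL
now[0] = 60.0
expired = cache.get("flags")                 # TTL elapsed, treated as a miss
```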

---

## HTTP Caching

### Cache-Control Directives

| Directive | Meaning |
|-----------|---------|
| `max-age=N` | Response is fresh for N seconds |
| `s-maxage=N` | Max age for shared caches such as CDNs (overrides `max-age` there) |
| `no-cache` | May be stored, but must be revalidated with the origin before each reuse |
| `no-store` | Never store the response in any cache |
| `must-revalidate` | Once stale, must revalidate before reuse |
| `private` | Only the user's browser may cache; shared caches such as CDNs must not |
| `public` | Any cache may store the response |
| `immutable` | Content will not change while fresh (within `max-age`), so skip revalidation |
| `stale-while-revalidate=N` | Serve stale for up to N seconds past expiry while refreshing in the background |

### Common Recipes

```
# Immutable static assets (hashed filenames)
Cache-Control: public, max-age=31536000, immutable

# API response, CDN-cached, background refresh
Cache-Control: public, s-maxage=60, stale-while-revalidate=300

# Personalized data, browser-only
Cache-Control: private, max-age=0, must-revalidate
ETag: "abc123"

# Never cache (auth tokens, sensitive data)
Cache-Control: no-store
```

### Conditional Requests

| Mechanism | Request Header | Response Header | How It Works |
|-----------|---------------|-----------------|-------------|
| **ETag** | `If-None-Match: "abc"` | `ETag: "abc"` | Hash-based — 304 if match |
| **Last-Modified** | `If-Modified-Since: <date>` | `Last-Modified: <date>` | Date-based — 304 if unchanged |

Prefer ETag over Last-Modified — ETags detect content changes regardless of timestamp granularity.
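
The ETag exchange can be sketched framework-free as below. `make_etag` and `respond` are hypothetical helpers, and the comparison is simplified: it ignores weak validators (`W/"..."`) and the `*` wildcard that a full implementation of `If-None-Match` would handle.

```python
import hashlib
from typing import Dict, Optional, Tuple


def make_etag(body: bytes) -> str:
    """Build a quoted ETag from a hash of the response body."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body: bytes, if_none_match: Optional[str]) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, payload) for a GET with an optional If-None-Match."""
    etag = make_etag(body)
    headers = {"ETag": etag, "Cache-Control": "private, max-age=0, must-revalidate"}
    if if_none_match is not None:
        candidates = [tag.strip() for tag in if_none_match.split(",")]
        if etag in candidates:
            return 304, headers, b""   # client copy is current: send no body
    return 200, headers, body          # first request or content changed


status1, headers1, payload1 = respond(b"hello", None)
status2, _, payload2 = respond(b"hello", headers1["ETag"])     # unchanged content
status3, _, payload3 = respond(b"changed", headers1["ETag"])   # content changed
```

The second call returns 304 with an empty payload, so the client reuses its cached copy; the third returns 200 with the new body because the hash no longer matches.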

---

## Application Caching

| Solution | Speed | Shared Across Processes | When to Use |
|----------|-------|------------------------|-------------|
| **In-memory LRU** | Fastest | No | Single-process, bounded memory, hot data |
| **Redis** | Sub-ms (network) | Yes | **Production default** — TTL, pub/sub, persistence |
| **Memcached** | Sub-ms (network) | Yes | Simple key-value at extreme scale |
| **SQLite** | Fast (disk) | Same host only | Embedded apps, edge caching |

### Redis vs Memcached

| Feature | Redis | Memcached |
|---------|-------|-----------|
| Data structures | Strings, hashes, lists, sets, sorted sets | Strings only |
| Persistence | AOF, RDB snapshots | None |
| Pub/Sub | Yes | No |
| Max value size | 512 MB | 1 MB (default, configurable) |
| **Verdict** | **Default choice** | Pure cache at extreme scale |

---

## Distributed Caching

| Concern | Solution |
|---------|----------|
| **Partitioning** | Consistent hashing — minimal reshuffling on node changes |
| **Replication** | Primary-replica — writes to primary, reads from replicas |
| **Failover** | Redis Sentinel or Cluster auto-failover |

**Rule of thumb:** 3 primaries + 3 replicas minimum for production Redis Cluster.
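
The "minimal reshuffling" property of consistent hashing in the partitioning row can be illustrated with a small hash ring. This is a sketch under assumptions: `HashRing`, the node names, and the choice of 100 virtual nodes are all illustrative, and Redis Cluster itself uses fixed hash slots rather than this exact scheme.

```python
import bisect
import hashlib
from typing import Dict, List


def _hash(value: str) -> int:
    """Map a string to a 64-bit point on the ring."""
    return int.from_bytes(hashlib.sha256(value.encode()).digest()[:8], "big")


class HashRing:
    """Consistent-hash ring with virtual nodes to smooth the key distribution."""

    def __init__(self, nodes: List[str], vnodes: int = 100) -> None:
        self._vnodes = vnodes
        self._owner: Dict[int, str] = {}   # ring point -> node name
        self._points: List[int] = []       # sorted ring points
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        for i in range(self._vnodes):
            point = _hash(f"{node}#{i}")
            if point not in self._owner:   # skip the (unlikely) collision
                bisect.insort(self._points, point)
                self._owner[point] = node

    def node_for(self, key: str) -> str:
        # First ring point clockwise from the key's hash, wrapping at the end.
        idx = bisect.bisect(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[idx]]


ring = HashRing(["cache-a", "cache-b", "cache-c"])
keys = [f"key:{i}" for i in range(1000)]
before = {k: ring.node_for(k) for k in keys}
ring.add_node("cache-d")
after = {k: ring.node_for(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
```

Every key in `moved` now belongs to `cache-d`; keys owned by the other nodes stay where they were, which is what keeps the hit rate from collapsing when a node is added.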

---

## Cache Eviction Policies

| Policy | How It Works | When to Use |
|--------|-------------|-------------|
| **LRU** | Evicts least recently accessed | **Default** — general purpose |
| **LFU** | Evicts least frequently accessed | Skewed popularity distributions |
| **FIFO** | Evicts oldest entry | Simple, time-ordered data |
| **TTL** | Evicts after fixed duration | Data with known freshness window |

> Redis default is `noeviction`. Set `maxmemory-policy` to `allkeys-lru` or `volatile-lru` for production.
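
For the in-process case, LRU eviction can be sketched with `collections.OrderedDict`. This is a minimal, single-threaded illustration; the class name `LRUCache` is chosen for the example, and Redis implements its own approximated LRU internally rather than this structure.

```python
from collections import OrderedDict
from typing import Any, Hashable, Optional


class LRUCache:
    """Bounded cache that evicts the least recently accessed entry."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._capacity = capacity
        self._items: "OrderedDict[Hashable, Any]" = OrderedDict()

    def get(self, key: Hashable) -> Optional[Any]:
        if key not in self._items:
            return None
        self._items.move_to_end(key)         # mark as most recently used
        return self._items[key]

    def put(self, key: Hashable, value: Any) -> None:
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self._capacity:
            self._items.popitem(last=False)  # evict the least recently used

    def __len__(self) -> int:
        return len(self._items)


cache = LRUCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")       # touch "a", so "b" becomes the eviction candidate
cache.put("c", 3)    # over capacity: "b" is evicted
```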

---

## Caching Layers

```
Browser Cache → CDN → Load Balancer → App Cache → DB Cache → Database
```

| Layer | What to Cache | Invalidation |
|-------|--------------|--------------|
| **Browser** | Static assets, API responses | Versioned URLs, Cache-Control |
| **CDN** | Static files, public API responses | Purge API, surrogate keys |
| **Application** | Computed results, DB queries, external API | Event-driven, TTL |
| **Database** | Query plans, buffer pool, materialized views | `ANALYZE`, manual refresh |

---

## Cache Stampede Prevention

When a hot key expires, hundreds of requests simultaneously hit the database.

| Technique | How It Works |
|-----------|-------------|
| **Mutex / Lock** | First request locks, fetches, populates; others wait |
| **Probabilistic early expiration** | Random chance of refreshing before TTL |
| **Request coalescing** | Deduplicate in-flight requests for same key |
| **Stale-while-revalidate** | Serve stale, refresh asynchronously |
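
The mutex technique from the table can be sketched with one lock per key and a re-check after the lock is acquired. This is an in-process illustration only; `SingleFlightCache` and `slow_db_read` are hypothetical names, and a multi-server deployment would need a distributed lock instead of `threading.Lock`.

```python
import threading
import time
from typing import Any, Callable, Dict, List

_MISSING = object()  # sentinel so a cached None is still a hit


class SingleFlightCache:
    """Per-key mutex: one caller loads a missing key while the others wait."""

    def __init__(self, loader: Callable[[str], Any]) -> None:
        self._loader = loader
        self._values: Dict[str, Any] = {}
        self._locks: Dict[str, threading.Lock] = {}
        self._guard = threading.Lock()       # protects the _locks dict itself

    def _lock_for(self, key: str) -> threading.Lock:
        with self._guard:
            return self._locks.setdefault(key, threading.Lock())

    def get(self, key: str) -> Any:
        value = self._values.get(key, _MISSING)
        if value is not _MISSING:            # fast path: already cached
            return value
        with self._lock_for(key):            # first caller takes the lock
            value = self._values.get(key, _MISSING)
            if value is _MISSING:            # re-check after waiting
                value = self._loader(key)    # only the lock holder loads
                self._values[key] = value
            return value


db_calls: List[str] = []


def slow_db_read(key: str) -> str:
    db_calls.append(key)
    time.sleep(0.05)                         # simulate a slow query
    return key.upper()


cache = SingleFlightCache(slow_db_read)
results: List[str] = []
threads = [threading.Thread(target=lambda: results.append(cache.get("hot")))
           for _ in range(20)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Twenty concurrent readers of the same cold key produce a single call to `slow_db_read`; the other nineteen block briefly and then read the populated value.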

---

## Cache Warming

| Strategy | When to Use |
|----------|-------------|
| **On-deploy warm-up** | Predictable key set, latency-sensitive |
| **Background job** | Reports, dashboards, catalog data |
| **Shadow traffic** | Cache migration, new infrastructure |
| **Priority-based** | Limited warm-up time budget |

> **Cold start impact:** A full cache flush can increase DB load 10–100x. Always warm gradually or use stale-while-revalidate.

---

## Monitoring

| Metric | Healthy Range | Action if Unhealthy |
|--------|--------------|---------------------|
| **Hit rate** | > 90% | Low → cache too small, wrong TTL, bad key design |
| **Eviction rate** | Near 0 steady state | High → increase memory or tune policy |
| **Latency (p99)** | < 1ms (Redis) | High → network issue, large values, hot key |
| **Memory usage** | < 80% of max | Approaching max → scale up or tune eviction |
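
A sketch of how the hit-rate and memory thresholds above might be checked in code. `CacheStats` and `health_warnings` are hypothetical names; the 90% and 80% limits come straight from the table, and real deployments would read the counters from their cache server's own statistics.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CacheStats:
    hits: int = 0
    misses: int = 0

    @property
    def hit_rate(self) -> float:
        """Fraction of lookups served from cache; 0.0 when there is no traffic."""
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


def health_warnings(stats: CacheStats, used_bytes: int, max_bytes: int) -> List[str]:
    """Compare current numbers against the healthy ranges in the table."""
    warnings: List[str] = []
    if stats.hit_rate <= 0.90:
        warnings.append("hit rate at or below 90%")
    if used_bytes >= 0.80 * max_bytes:
        warnings.append("memory at or above 80% of max")
    return warnings


stats = CacheStats(hits=850, misses=150)
warnings = health_warnings(stats, used_bytes=900, max_bytes=1000)
healthy = health_warnings(CacheStats(hits=95, misses=5), used_bytes=500, max_bytes=1000)
```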

---

## NEVER Do

1. **NEVER cache without a TTL or invalidation plan** — data rots; every entry needs an expiry path
2. **NEVER treat cache as durable storage** — caches evict, crash, and restart; always fall back to source of truth
3. **NEVER cache sensitive data (tokens, PII) without encryption** — cache breaches expose everything in plaintext
4. **NEVER ignore cache stampede on hot keys** — one expired popular key can take down your database
5. **NEVER use unbounded in-memory caches in production** — memory grows until OOM-killed
6. **NEVER cache mutable data with `immutable` Cache-Control** — browsers will never re-fetch
7. **NEVER skip monitoring hit/miss rates** — you won't know if your cache is helping or hurting
