storage-debug-instrumentation

Add comprehensive debugging and observability tooling for backend storage layers (PostgreSQL, ChromaDB) and startup metrics. Includes storage drift detection, raw data inspection endpoints, and a Next.js admin dashboard.

242 stars

byaiskillstore

View on GitHub Installation ↓

Best use case

storage-debug-instrumentation is best used when you need a repeatable AI agent workflow instead of a one-off prompt. It is especially useful for teams working in multi. Add comprehensive debugging and observability tooling for backend storage layers (PostgreSQL, ChromaDB) and startup metrics. Includes storage drift detection, raw data inspection endpoints, and a Next.js admin dashboard.

Users should expect a more consistent workflow output, faster repeated execution, and less time spent rewriting prompts from scratch.

Practical example

Example input

Use the "storage-debug-instrumentation" skill to help with this workflow task. Context: Add comprehensive debugging and observability tooling for backend storage layers (PostgreSQL, ChromaDB) and startup metrics. Includes storage drift detection, raw data inspection endpoints, and a Next.js admin dashboard.

Example output

A structured workflow result with clearer steps, more consistent formatting, and an output that is easier to reuse in the next run.

When to use this skill

Use this skill when you want a reusable workflow rather than writing the same prompt again and again.

When not to use this skill

Do not use this when you only need a one-off answer and do not need a reusable workflow.
Do not use it if you cannot install or maintain the related files, repository context, or supporting tools.

Installation

Claude Code / Cursor / Codex

$curl -o ~/.claude/skills/storage-debug-instrumentation/SKILL.md --create-dirs "https://raw.githubusercontent.com/aiskillstore/marketplace/main/skills/benderfendor/storage-debug-instrumentation/SKILL.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/storage-debug-instrumentation/SKILL.md inside your project
Restart your AI agent — it will auto-discover the skill

How storage-debug-instrumentation Compares

Feature / Agent	storage-debug-instrumentation	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

What does this skill do?

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# Storage Debug Instrumentation

## Purpose
Enable rapid diagnosis of storage state, synchronization health, and backend performance bottlenecks by exposing:
- Raw article inspection from both PostgreSQL and ChromaDB
- Storage drift detection (missing/dangling entries)
- Detailed startup timeline breakdown (DB init, cache preload, vector store, RSS refresh)
- One-page debug dashboard consolidating all diagnostics

## Scope
- Backend: `app/services/startup_metrics.py`, `app/main.py`, `app/vector_store.py`, `app/database.py`, `app/api/routes/debug.py`
- Frontend: `frontend/lib/api.ts`, `frontend/app/debug/page.tsx`
- No schema changes; purely additive instrumentation and debug routes

## Workflow

### 1. Create startup metrics service
**File:** `backend/app/services/startup_metrics.py`
- Implement thread-safe `StartupMetrics` class to record phase timings
- Expose `record_event(name, started_at, detail, metadata)` for phase capture
- Support `add_note(key, value)` for arbitrary annotations
- Export singleton `startup_metrics` for app-wide use

### 2. Instrument vector store initialization
**File:** `backend/app/vector_store.py`
- Import `startup_metrics`
- In `VectorStore.__init__()`, wrap initialization with `time.time()` timer
- Record event with metadata: `host`, `port`, `collection`, `documents`
- Catch connection errors and annotate them

### 3. Instrument FastAPI startup sequence
**File:** `backend/app/main.py`
- Call `startup_metrics.mark_app_started()` at beginning of `on_startup()`
- Wrap each phase (DB init, schedulers, cache preload, RSS refresh, migration) with `record_event()`
- Include metadata: `cache_size`, `article_count`, `oldest_article_hours`
- Call `startup_metrics.mark_app_completed()` at end
- Add app version notes via `add_note()`

### 4. Add database pagination helpers
**File:** `backend/app/database.py`
- Implement `fetch_articles_page()` to support:
  - Limit/offset pagination
  - Optional source filter
  - Missing-embeddings-only flag
  - Published date range filters
  - Sort direction (asc/desc)
  - Return oldest/newest timestamp bounds
- Implement `fetch_article_chroma_mappings()` to return all article→chroma ID mappings for drift analysis

### 5. Add vector store pagination helpers
**File:** `backend/app/vector_store.py`
- Implement `list_articles(limit, offset)` to return paginated Chroma documents with metadata and previews
- Implement `list_all_ids()` to return all stored Chroma IDs for drift detection (used by `/debug/storage/drift`)

### 6. Expose debug API endpoints
**File:** `backend/app/api/routes/debug.py`
- Add `GET /debug/startup` → returns startup metrics timeline (events + notes)
- Add `GET /debug/chromadb/articles` → returns paginated raw Chroma entries with limit/offset
- Add `GET /debug/database/articles` → returns paginated Postgres rows with filters (source, embeddings, date range, sort)
- Add `GET /debug/storage/drift` → compares Chroma IDs vs Postgres mappings, returns missing/dangling counts + samples

### 7. Add frontend API bindings
**File:** `frontend/lib/api.ts`
- Export types: `StartupEventMetric`, `StartupMetricsResponse`, `ChromaDebugResponse`, `DatabaseDebugResponse`, `StorageDriftReport`
- Export fetchers: `fetchStartupMetrics()`, `fetchChromaDebugArticles()`, `fetchDatabaseDebugArticles()`, `fetchStorageDrift()`
- Ensure snake_case→camelCase mapping for response fields

### 8. Build debug dashboard page
**File:** `frontend/app/debug/page.tsx`
- Create `/debug` route with multi-tab inspection UI
- Render startup timeline: phase name, duration, metadata badges (cache size, vectors, migrated records)
- Display Chroma browser: paginated table with ID, title, source, preview
- Display Postgres browser: paginated table with filters (source, date range, missing-embeddings-only flag)
- Display drift report: sample tables for missing-in-chroma and dangling-in-chroma entries
- Include summary cards for quick metrics (boot time, total articles, vector count, drift count)

## Implementation checklist
- [ ] Create `backend/app/services/startup_metrics.py`
- [ ] Instrument `backend/app/vector_store.py::VectorStore.__init__()`
- [ ] Instrument `backend/app/main.py::on_startup()` (all phases)
- [ ] Add `fetch_articles_page()` and `fetch_article_chroma_mappings()` to `backend/app/database.py`
- [ ] Add `list_articles()` and `list_all_ids()` to `backend/app/vector_store.py`
- [ ] Add `/debug/startup`, `/debug/chromadb/articles`, `/debug/database/articles`, `/debug/storage/drift` to `backend/app/api/routes/debug.py`
- [ ] Add types and fetchers to `frontend/lib/api.ts`
- [ ] Create `frontend/app/debug/page.tsx` with dashboard layout
- [ ] Run `uvx ruff check backend` → all checks pass
- [ ] Test endpoints in curl or Postman to verify response structure

## Verification checklist
- [ ] `GET http://localhost:8000/debug/startup` returns valid timeline with events and notes
- [ ] `GET http://localhost:8000/debug/chromadb/articles?limit=50&offset=0` returns paginated Chroma docs
- [ ] `GET http://localhost:8000/debug/database/articles?source=bbc&missing_embeddings_only=false` filters correctly
- [ ] `GET http://localhost:8000/debug/storage/drift` compares counts and returns drift samples
- [ ] `http://localhost:3000/debug` loads without errors and displays all four sections
- [ ] Refresh button triggers all four API calls in parallel
- [ ] Pagination controls update limit/offset correctly
- [ ] Database filters (source, date range) update and refresh data
- [ ] Startup timeline shows non-zero phase durations if backend just started

## Future enhancements
- Streaming startup metrics via SSE (live tail during boot)
- Export startup report as JSON/CSV for performance tracking over time
- Automated drift alerts (post to Slack/email if dangling > threshold)
- Performance graphs (startup time trends, article throughput)
- Sync-on-demand action (button to force vector store refresh for missing articles)

Related Skills

error-diagnostics-smart-debug

242

from aiskillstore/marketplace

Use when working with error diagnostics smart debug

error-debugging-multi-agent-review

242

from aiskillstore/marketplace

Use when working with error debugging multi agent review

error-debugging-error-trace

242

from aiskillstore/marketplace

You are an error tracking and observability expert specializing in implementing comprehensive error monitoring solutions. Set up error tracking systems, configure alerts, implement structured logging, and ensure teams can quickly identify and resolve production issues.

error-debugging-error-analysis

242

from aiskillstore/marketplace

You are an expert error analysis specialist with deep expertise in debugging distributed systems, analyzing production incidents, and implementing comprehensive observability solutions.

distributed-debugging-debug-trace

242

from aiskillstore/marketplace

You are a debugging expert specializing in setting up comprehensive debugging environments, distributed tracing, and diagnostic tools. Configure debugging workflows, implement tracing solutions, and establish troubleshooting practices for development and production environments.

debugging-toolkit-smart-debug

242

from aiskillstore/marketplace

Use when working with debugging toolkit smart debug

debugging-strategies

242

from aiskillstore/marketplace

Master systematic debugging techniques, profiling tools, and root cause analysis to efficiently track down bugs across any codebase or technology stack. Use when investigating bugs, performance issues, or unexpected behavior.

debugger

242

from aiskillstore/marketplace

Debugging specialist for errors, test failures, and unexpected behavior. Use proactively when encountering any issues.

azure-storage-queue-ts

242

from aiskillstore/marketplace

Azure Queue Storage JavaScript/TypeScript SDK (@azure/storage-queue) for message queue operations. Use for sending, receiving, peeking, and deleting messages in queues. Supports visibility timeout, message encoding, and batch operations. Triggers: "queue storage", "@azure/storage-queue", "QueueServiceClient", "QueueClient", "send message", "receive message", "dequeue", "visibility timeout".

azure-storage-queue-py

242

from aiskillstore/marketplace

Azure Queue Storage SDK for Python. Use for reliable message queuing, task distribution, and asynchronous processing. Triggers: "queue storage", "QueueServiceClient", "QueueClient", "message queue", "dequeue".

azure-storage-file-share-ts

242

from aiskillstore/marketplace

Azure File Share JavaScript/TypeScript SDK (@azure/storage-file-share) for SMB file share operations. Use for creating shares, managing directories, uploading/downloading files, and handling file metadata. Supports Azure Files SMB protocol scenarios. Triggers: "file share", "@azure/storage-file-share", "ShareServiceClient", "ShareClient", "SMB", "Azure Files".

azure-storage-file-share-py

242

from aiskillstore/marketplace

Azure Storage File Share SDK for Python. Use for SMB file shares, directories, and file operations in the cloud. Triggers: "azure-storage-file-share", "ShareServiceClient", "ShareClient", "file share", "SMB".