storage-debug-instrumentation
Add comprehensive debugging and observability tooling for backend storage layers (PostgreSQL, ChromaDB) and startup metrics. Includes storage drift detection, raw data inspection endpoints, and a Next.js admin dashboard.
Best use case
storage-debug-instrumentation is best used when you need a repeatable AI agent workflow instead of a one-off prompt. It is especially useful for teams working in multi.
Users should expect a more consistent workflow output, faster repeated execution, and less time spent rewriting prompts from scratch.
Practical example
Example input
Use the "storage-debug-instrumentation" skill to help with this workflow task. Context: Add comprehensive debugging and observability tooling for backend storage layers (PostgreSQL, ChromaDB) and startup metrics. Includes storage drift detection, raw data inspection endpoints, and a Next.js admin dashboard.
Example output
A structured workflow result with clearer steps, more consistent formatting, and an output that is easier to reuse in the next run.
When to use this skill
- Use this skill when you want a reusable workflow rather than writing the same prompt again and again.
When not to use this skill
- Do not use this when you only need a one-off answer and do not need a reusable workflow.
- Do not use it if you cannot install or maintain the related files, repository context, or supporting tools.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in `.claude/skills/storage-debug-instrumentation/SKILL.md` inside your project
- Restart your AI agent; it will auto-discover the skill
How storage-debug-instrumentation Compares
| Feature / Agent | storage-debug-instrumentation | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
It adds debugging and observability tooling for backend storage layers (PostgreSQL, ChromaDB) and startup metrics, including storage drift detection, raw data inspection endpoints, and a Next.js admin dashboard.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# Storage Debug Instrumentation

## Purpose

Enable rapid diagnosis of storage state, synchronization health, and backend performance bottlenecks by exposing:

- Raw article inspection from both PostgreSQL and ChromaDB
- Storage drift detection (missing/dangling entries)
- Detailed startup timeline breakdown (DB init, cache preload, vector store, RSS refresh)
- One-page debug dashboard consolidating all diagnostics

## Scope

- Backend: `app/services/startup_metrics.py`, `app/main.py`, `app/vector_store.py`, `app/database.py`, `app/api/routes/debug.py`
- Frontend: `frontend/lib/api.ts`, `frontend/app/debug/page.tsx`
- No schema changes; purely additive instrumentation and debug routes

## Workflow

### 1. Create startup metrics service

**File:** `backend/app/services/startup_metrics.py`

- Implement thread-safe `StartupMetrics` class to record phase timings
- Expose `record_event(name, started_at, detail, metadata)` for phase capture
- Support `add_note(key, value)` for arbitrary annotations
- Export singleton `startup_metrics` for app-wide use

### 2. Instrument vector store initialization

**File:** `backend/app/vector_store.py`

- Import `startup_metrics`
- In `VectorStore.__init__()`, wrap initialization with `time.time()` timer
- Record event with metadata: `host`, `port`, `collection`, `documents`
- Catch connection errors and annotate them

### 3. Instrument FastAPI startup sequence

**File:** `backend/app/main.py`

- Call `startup_metrics.mark_app_started()` at beginning of `on_startup()`
- Wrap each phase (DB init, schedulers, cache preload, RSS refresh, migration) with `record_event()`
- Include metadata: `cache_size`, `article_count`, `oldest_article_hours`
- Call `startup_metrics.mark_app_completed()` at end
- Add app version notes via `add_note()`

### 4. Add database pagination helpers

**File:** `backend/app/database.py`

- Implement `fetch_articles_page()` to support:
  - Limit/offset pagination
  - Optional source filter
  - Missing-embeddings-only flag
  - Published date range filters
  - Sort direction (asc/desc)
- Return oldest/newest timestamp bounds
- Implement `fetch_article_chroma_mappings()` to return all article→chroma ID mappings for drift analysis

### 5. Add vector store pagination helpers

**File:** `backend/app/vector_store.py`

- Implement `list_articles(limit, offset)` to return paginated Chroma documents with metadata and previews
- Implement `list_all_ids()` to return all stored Chroma IDs for drift detection (used by `/debug/storage/drift`)

### 6. Expose debug API endpoints

**File:** `backend/app/api/routes/debug.py`

- Add `GET /debug/startup` → returns startup metrics timeline (events + notes)
- Add `GET /debug/chromadb/articles` → returns paginated raw Chroma entries with limit/offset
- Add `GET /debug/database/articles` → returns paginated Postgres rows with filters (source, embeddings, date range, sort)
- Add `GET /debug/storage/drift` → compares Chroma IDs vs Postgres mappings, returns missing/dangling counts + samples

### 7. Add frontend API bindings

**File:** `frontend/lib/api.ts`

- Export types: `StartupEventMetric`, `StartupMetricsResponse`, `ChromaDebugResponse`, `DatabaseDebugResponse`, `StorageDriftReport`
- Export fetchers: `fetchStartupMetrics()`, `fetchChromaDebugArticles()`, `fetchDatabaseDebugArticles()`, `fetchStorageDrift()`
- Ensure snake_case→camelCase mapping for response fields

### 8. Build debug dashboard page

**File:** `frontend/app/debug/page.tsx`

- Create `/debug` route with multi-tab inspection UI
- Render startup timeline: phase name, duration, metadata badges (cache size, vectors, migrated records)
- Display Chroma browser: paginated table with ID, title, source, preview
- Display Postgres browser: paginated table with filters (source, date range, missing-embeddings-only flag)
- Display drift report: sample tables for missing-in-chroma and dangling-in-chroma entries
- Include summary cards for quick metrics (boot time, total articles, vector count, drift count)

## Implementation checklist

- [ ] Create `backend/app/services/startup_metrics.py`
- [ ] Instrument `backend/app/vector_store.py::VectorStore.__init__()`
- [ ] Instrument `backend/app/main.py::on_startup()` (all phases)
- [ ] Add `fetch_articles_page()` and `fetch_article_chroma_mappings()` to `backend/app/database.py`
- [ ] Add `list_articles()` and `list_all_ids()` to `backend/app/vector_store.py`
- [ ] Add `/debug/startup`, `/debug/chromadb/articles`, `/debug/database/articles`, `/debug/storage/drift` to `backend/app/api/routes/debug.py`
- [ ] Add types and fetchers to `frontend/lib/api.ts`
- [ ] Create `frontend/app/debug/page.tsx` with dashboard layout
- [ ] Run `uvx ruff check backend` → all checks pass
- [ ] Test endpoints in curl or Postman to verify response structure

## Verification checklist

- [ ] `GET http://localhost:8000/debug/startup` returns valid timeline with events and notes
- [ ] `GET http://localhost:8000/debug/chromadb/articles?limit=50&offset=0` returns paginated Chroma docs
- [ ] `GET http://localhost:8000/debug/database/articles?source=bbc&missing_embeddings_only=false` filters correctly
- [ ] `GET http://localhost:8000/debug/storage/drift` compares counts and returns drift samples
- [ ] `http://localhost:3000/debug` loads without errors and displays all four sections
- [ ] Refresh button triggers all four API calls in parallel
- [ ] Pagination controls update limit/offset correctly
- [ ] Database filters (source, date range) update and refresh data
- [ ] Startup timeline shows non-zero phase durations if backend just started

## Future enhancements

- Streaming startup metrics via SSE (live tail during boot)
- Export startup report as JSON/CSV for performance tracking over time
- Automated drift alerts (post to Slack/email if dangling > threshold)
- Performance graphs (startup time trends, article throughput)
- Sync-on-demand action (button to force vector store refresh for missing articles)
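Step 1 of the workflow asks for a thread-safe `StartupMetrics` class without showing one. The following is a minimal sketch of what such a class might look like, not the skill's actual implementation. The method names `record_event`, `add_note`, `mark_app_started`, and `mark_app_completed` come from the workflow text; the `snapshot()` method, the event field names (`duration_seconds` and so on), and the choice of a single `threading.Lock` are assumptions made for illustration.

```python
import threading
import time
from typing import Any, Dict, List, Optional


class StartupMetrics:
    """Thread-safe recorder for startup phase timings (illustrative sketch)."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._events: List[Dict[str, Any]] = []
        self._notes: Dict[str, Any] = {}
        self._app_started_at: Optional[float] = None
        self._app_completed_at: Optional[float] = None

    def mark_app_started(self) -> None:
        with self._lock:
            self._app_started_at = time.time()

    def mark_app_completed(self) -> None:
        with self._lock:
            self._app_completed_at = time.time()

    def record_event(
        self,
        name: str,
        started_at: float,
        detail: Optional[str] = None,
        metadata: Optional[Dict[str, Any]] = None,
    ) -> None:
        # The phase is assumed to end at the moment record_event() is called.
        finished_at = time.time()
        event = {
            "name": name,
            "started_at": started_at,
            "duration_seconds": max(0.0, finished_at - started_at),
            "detail": detail,
            "metadata": dict(metadata or {}),
        }
        with self._lock:
            self._events.append(event)

    def add_note(self, key: str, value: Any) -> None:
        with self._lock:
            self._notes[key] = value

    def snapshot(self) -> Dict[str, Any]:
        # Hypothetical helper: returns copies so callers cannot mutate state.
        with self._lock:
            total = None
            if self._app_started_at is not None and self._app_completed_at is not None:
                total = self._app_completed_at - self._app_started_at
            return {
                "total_seconds": total,
                "events": [dict(event) for event in self._events],
                "notes": dict(self._notes),
            }


# Module-level singleton, as the workflow describes.
startup_metrics = StartupMetrics()
```

A startup phase would then be timed by capturing `started = time.time()` before the phase and calling `startup_metrics.record_event("db_init", started, metadata={"article_count": n})` after it. A `/debug/startup` handler could return `startup_metrics.snapshot()` directly.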
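Step 4 lists the filters `fetch_articles_page()` should support. The sketch below shows one way to build such a query with bound parameters. It is illustrative only: it uses the standard-library `sqlite3` module as a stand-in for PostgreSQL so that it runs without a server, and the table name `articles`, the columns `published_at` and `chroma_id`, the parameter names, and the returned keys are all assumptions. With a PostgreSQL driver such as psycopg, the placeholders would be `%s` rather than `?`.

```python
import sqlite3
from typing import Any, Dict, List, Optional

COLUMNS = ("id", "title", "source", "published_at", "chroma_id")


def fetch_articles_page(
    conn: sqlite3.Connection,
    limit: int = 50,
    offset: int = 0,
    source: Optional[str] = None,
    missing_embeddings_only: bool = False,
    published_after: Optional[str] = None,
    published_before: Optional[str] = None,
    sort_desc: bool = True,
) -> Dict[str, Any]:
    """Return one page of articles plus timestamp bounds for the filtered set."""
    clauses: List[str] = []
    params: List[Any] = []
    if source is not None:
        clauses.append("source = ?")
        params.append(source)
    if missing_embeddings_only:
        # Assumption: a NULL chroma_id means no embedding has been stored.
        clauses.append("chroma_id IS NULL")
    if published_after is not None:
        clauses.append("published_at >= ?")
        params.append(published_after)
    if published_before is not None:
        clauses.append("published_at <= ?")
        params.append(published_before)

    where = (" WHERE " + " AND ".join(clauses)) if clauses else ""
    # The direction comes from a fixed pair of literals, never from user input.
    direction = "DESC" if sort_desc else "ASC"

    rows = conn.execute(
        f"SELECT {', '.join(COLUMNS)} FROM articles{where} "
        f"ORDER BY published_at {direction} LIMIT ? OFFSET ?",
        params + [limit, offset],
    ).fetchall()
    total, oldest, newest = conn.execute(
        f"SELECT COUNT(*), MIN(published_at), MAX(published_at) FROM articles{where}",
        params,
    ).fetchone()

    return {
        "articles": [dict(zip(COLUMNS, row)) for row in rows],
        "total": total,
        "limit": limit,
        "offset": offset,
        "oldest_published_at": oldest,
        "newest_published_at": newest,
    }
```

Filter values travel only as bound parameters, which keeps a debug endpoint that accepts arbitrary `source` strings from becoming an injection vector.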
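The drift check behind `GET /debug/storage/drift` (steps 4 to 6) reduces to a set comparison between the Chroma IDs that Postgres expects and the IDs Chroma actually holds. The sketch below assumes that "missing" means an ID mapped in Postgres but absent from Chroma, and "dangling" means an ID present in Chroma with no Postgres mapping. The function name `compute_storage_drift`, its arguments, and the response keys are hypothetical; in the real service the inputs would presumably come from `fetch_article_chroma_mappings()` and `list_all_ids()`.

```python
from typing import Any, Dict, Iterable


def compute_storage_drift(
    article_to_chroma_id: Dict[int, str],
    chroma_ids: Iterable[str],
    sample_size: int = 10,
) -> Dict[str, Any]:
    """Compare Postgres article->Chroma mappings against stored Chroma IDs."""
    expected = set(article_to_chroma_id.values())
    stored = set(chroma_ids)

    # Mapped in Postgres but not found in Chroma.
    missing = sorted(expected - stored)
    # Found in Chroma but not referenced by any Postgres row.
    dangling = sorted(stored - expected)

    return {
        "postgres_mapped": len(expected),
        "chroma_total": len(stored),
        "missing_in_chroma_count": len(missing),
        "dangling_in_chroma_count": len(dangling),
        # Samples are capped so the response stays small on large stores.
        "missing_in_chroma_sample": missing[:sample_size],
        "dangling_in_chroma_sample": dangling[:sample_size],
    }
```

For example, `compute_storage_drift({1: "a", 2: "b", 3: "c"}, ["b", "c", "d"])` reports `"a"` as missing in Chroma and `"d"` as dangling. Sorting before sampling makes the samples deterministic between refreshes, which helps when comparing two reports by eye.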
Related Skills
error-diagnostics-smart-debug
Use when working with error diagnostics smart debug
error-debugging-multi-agent-review
Use when working with error debugging multi agent review
error-debugging-error-trace
You are an error tracking and observability expert specializing in implementing comprehensive error monitoring solutions. Set up error tracking systems, configure alerts, implement structured logging, and ensure teams can quickly identify and resolve production issues.
error-debugging-error-analysis
You are an expert error analysis specialist with deep expertise in debugging distributed systems, analyzing production incidents, and implementing comprehensive observability solutions.
distributed-debugging-debug-trace
You are a debugging expert specializing in setting up comprehensive debugging environments, distributed tracing, and diagnostic tools. Configure debugging workflows, implement tracing solutions, and establish troubleshooting practices for development and production environments.
debugging-toolkit-smart-debug
Use when working with debugging toolkit smart debug
debugging-strategies
Master systematic debugging techniques, profiling tools, and root cause analysis to efficiently track down bugs across any codebase or technology stack. Use when investigating bugs, performance issues, or unexpected behavior.
debugger
Debugging specialist for errors, test failures, and unexpected behavior. Use proactively when encountering any issues.
azure-storage-queue-ts
Azure Queue Storage JavaScript/TypeScript SDK (@azure/storage-queue) for message queue operations. Use for sending, receiving, peeking, and deleting messages in queues. Supports visibility timeout, message encoding, and batch operations. Triggers: "queue storage", "@azure/storage-queue", "QueueServiceClient", "QueueClient", "send message", "receive message", "dequeue", "visibility timeout".
azure-storage-queue-py
Azure Queue Storage SDK for Python. Use for reliable message queuing, task distribution, and asynchronous processing. Triggers: "queue storage", "QueueServiceClient", "QueueClient", "message queue", "dequeue".
azure-storage-file-share-ts
Azure File Share JavaScript/TypeScript SDK (@azure/storage-file-share) for SMB file share operations. Use for creating shares, managing directories, uploading/downloading files, and handling file metadata. Supports Azure Files SMB protocol scenarios. Triggers: "file share", "@azure/storage-file-share", "ShareServiceClient", "ShareClient", "SMB", "Azure Files".
azure-storage-file-share-py
Azure Storage File Share SDK for Python. Use for SMB file shares, directories, and file operations in the cloud. Triggers: "azure-storage-file-share", "ShareServiceClient", "ShareClient", "file share", "SMB".