agentdb-vector-search-optimization

AgentDB Vector Search Optimization operates on 3 fundamental principles:

28 stars

Best use case

agentdb-vector-search-optimization is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

AgentDB Vector Search Optimization operates on 3 fundamental principles:

Teams using agentdb-vector-search-optimization should expect a more consistent output, faster repeated execution, less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$curl -o ~/.claude/skills/agentdb-vector-search-optimization/SKILL.md --create-dirs "https://raw.githubusercontent.com/DNYoussef/context-cascade/main/skills/platforms/agentdb-vector-search-optimization/SKILL.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/agentdb-vector-search-optimization/SKILL.md inside your project
Restart your AI agent — it will auto-discover the skill

How agentdb-vector-search-optimization Compares

Feature / Agent	agentdb-vector-search-optimization	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

What does this skill do?

AgentDB Vector Search Optimization operates on 3 fundamental principles:

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# AgentDB Vector Search Optimization



---

## LIBRARY-FIRST PROTOCOL (MANDATORY)

**Before writing ANY code, you MUST check:**

### Step 1: Library Catalog
- Location: `.claude/library/catalog.json`
- If match >70%: REUSE or ADAPT

### Step 2: Patterns Guide
- Location: `.claude/docs/inventories/LIBRARY-PATTERNS-GUIDE.md`
- If pattern exists: FOLLOW documented approach

### Step 3: Existing Projects
- Location: `D:\Projects\*`
- If found: EXTRACT and adapt

### Decision Matrix
| Match | Action |
|-------|--------|
| Library >90% | REUSE directly |
| Library 70-90% | ADAPT minimally |
| Pattern exists | FOLLOW pattern |
| In project | EXTRACT |
| No match | BUILD (add to library after) |

---

## Overview

Optimize AgentDB performance with quantization (4-32x memory reduction), HNSW indexing (150x faster search), caching, and batch operations for scaling to millions of vectors.

## SOP Framework: 5-Phase Optimization

### Phase 1: Baseline Performance (1 hour)
- Measure current metrics (latency, throughput, memory)
- Identify bottlenecks
- Set optimization targets

### Phase 2: Apply Quantization (1-2 hours)
- Configure product quantization
- Train codebooks
- Apply compression
- Validate accuracy

### Phase 3: Implement HNSW Indexing (1-2 hours)
- Build HNSW index
- Tune parameters (M, efConstruction, efSearch)
- Benchmark speedup

### Phase 4: Configure Caching (1 hour)
- Implement query cache
- Set TTL and eviction policies
- Monitor hit rates

### Phase 5: Benchmark Results (1-2 hours)
- Run comprehensive benchmarks
- Compare before/after
- Validate improvements

## Quick Start

```typescript
import { AgentDB, Quantization, QueryCache } from 'agentdb-optimization';

const db = new AgentDB({ name: 'optimized-db', dimensions: 1536 });

// Quantization (4x memory reduction)
const quantizer = new Quantization({
  method: 'product-quantization',
  compressionRatio: 4
});
await db.applyQuantization(quantizer);

// HNSW indexing (150x speedup)
await db.createIndex({
  type: 'hnsw',
  params: { M: 16, efConstruction: 200 }
});

// Caching
db.setCache(new QueryCache({
  maxSize: 10000,
  ttl: 3600000
}));
```

## Optimization Techniques

### Quantization
- **Product Quantization**: 4-8x compression
- **Scalar Quantization**: 2-4x compression
- **Binary Quantization**: 32x compression

### Indexing
- **HNSW**: 150x faster, high accuracy
- **IVF**: Fast, partitioned search
- **LSH**: Approximate search

### Caching
- **Query Cache**: LRU eviction
- **Result Cache**: TTL-based
- **Embedding Cache**: Reuse embeddings

## Success Metrics

- Memory reduction: 4-32x
- Search speedup: 150x
- Accuracy maintained: > 95%
- Cache hit rate: > 70%

## MCP Requirements

This skill operates using AgentDB's npm package and API only. No additional MCP servers required.

All AgentDB optimization operations are performed through:
- npm CLI: `npx agentdb@latest`
- TypeScript/JavaScript API: `import { AgentDB, Quantization, QueryCache } from 'agentdb-optimization'`

## Additional Resources

- Full docs: SKILL.md
- AgentDB Optimization: https://agentdb.dev/docs/optimization

## Core Principles

AgentDB Vector Search Optimization operates on 3 fundamental principles:

### Principle 1: Quantization - Trade Negligible Accuracy for Massive Memory Reduction

Vector databases face a fundamental constraint: high-dimensional embeddings (768-1536 dimensions) consume enormous memory at scale. Quantization techniques compress vectors by 4-32x through codebook encoding, enabling systems to hold millions of vectors in memory while maintaining 95%+ accuracy.

In practice:
- Apply product quantization (4-8x compression) for production workloads requiring high accuracy
- Use scalar quantization (2-4x compression) when exact distances matter for ranking
- Deploy binary quantization (32x compression) for massive-scale approximate search where recall > precision

### Principle 2: HNSW Indexing - Logarithmic Search Instead of Linear Scan

Brute-force vector search scales O(n) - doubling vectors doubles search time. HNSW (Hierarchical Navigable Small World) indexes create multi-layer graphs that enable O(log n) search, delivering 150x speedups with tunable accuracy trade-offs through the efSearch parameter.

In practice:
- Build HNSW indexes with M=16 (connections per node) for balanced performance and recall
- Set efConstruction=200 during index building to ensure high-quality graph topology
- Tune efSearch=50-200 at query time to balance speed vs accuracy (higher = slower but more accurate)

### Principle 3: Caching - Exploit Query Locality for 10x Speedup

Real-world vector search exhibits extreme query locality - users repeatedly search similar queries, recommendation systems re-rank same candidates, RAG systems revisit common documents. LRU caches with 70%+ hit rates eliminate 70% of expensive vector computations.

In practice:
- Implement query result caching with TTL=1 hour to capture repeated searches
- Cache embedding computations for frequently accessed text (document titles, product names)
- Use cache key hashing (xxhash64) to handle high-dimensional query vectors efficiently

## Common Anti-Patterns

| Anti-Pattern | Problem | Solution |
|--------------|---------|----------|
| **Premature Quantization - Apply Compression Before Understanding Workload** | Blindly quantizing to 32x compression destroys accuracy for high-precision tasks (medical diagnosis, financial fraud detection). Optimizing the wrong metric creates worse outcomes. | Establish baseline metrics FIRST (Phase 1): measure unoptimized latency, throughput, memory, accuracy. Then apply quantization incrementally (4x -> 8x -> 16x) while validating accuracy > 95% at each step. |
| **Index Amnesia - Rebuild HNSW on Every Query** | HNSW indexes take minutes to build (efConstruction=200 for 1M vectors). Rebuilding on every query eliminates all speedup benefits. | Build indexes ONCE during initialization or batch updates. Persist indexes to disk. Use incremental updates for new vectors instead of full rebuilds. |
| **Cache Thrashing - TTL Too Short or Cache Too Small** | Setting TTL=60 seconds evicts results before reuse. Tiny cache (1000 entries) causes constant evictions in high-traffic systems. | Set TTL based on query frequency (1 hour for typical workloads, 24 hours for stable datasets). Size cache to capture 70%+ of traffic (10K-100K entries typical). Monitor hit rate and adjust. |

## Conclusion

AgentDB Vector Search Optimization unlocks production-scale performance for vector databases through three complementary techniques: quantization (4-32x memory reduction), HNSW indexing (150x search speedup), and caching (70%+ hit rate efficiency). The 5-phase SOP guides systematic optimization from baseline measurement through quantization, indexing, caching, and comprehensive benchmarking, ensuring improvements are validated with hard metrics rather than assumptions.

This skill is critical when scaling vector search beyond toy datasets to millions of vectors, reducing infrastructure costs by 75%+ through memory compression, or achieving sub-10ms latency requirements for real-time applications like recommendation engines and search APIs. The techniques are complementary, not competing - quantization reduces memory footprint enabling larger in-memory indexes, HNSW accelerates the search operation itself, and caching eliminates redundant computation for common queries.

The key insight is performance optimization requires measurement-driven iteration. Blindly applying "best practices" without understanding your workload's query patterns, accuracy requirements, and scale constraints leads to wasted effort or degraded quality. By establishing baselines first, applying optimizations incrementally, and validating improvements at each step, you achieve dramatic performance gains while maintaining the accuracy guarantees your application demands. The result is vector search systems that scale to billions of vectors while remaining responsive enough for interactive user experiences.

Related Skills

reasoningbank-with-agentdb

from DNYoussef/context-cascade

Implement ReasoningBank adaptive learning with AgentDBs 150x faster vector database. Includes trajectory tracking, verdict judgment, memory distillation, and pattern recognition. Use when building self-learning agents, optimizing decision-making, or implementing experience replay systems.

reasoningbank-adaptive-learning-with-agentdb

from DNYoussef/context-cascade

---

gemini-research

from DNYoussef/context-cascade

Use Gemini CLI for research with Google Search grounding and 1M token context

agentdb-semantic-vector-search

from DNYoussef/context-cascade

---

agentdb-reinforcement-learning-training

from DNYoussef/context-cascade

AgentDB Reinforcement Learning Training operates on 3 fundamental principles:

agentdb-persistent-memory-patterns

from DNYoussef/context-cascade

AgentDB Persistent Memory Patterns operates on 3 fundamental principles:

agentdb-performance-optimization

from DNYoussef/context-cascade

Apply quantization to reduce memory by 4-32x. Enable HNSW indexing for 150x faster search. Configure caching strategies and implement batch operations. Use when optimizing memory usage, improving search speed, or scaling to millions of vectors. Deploy these optimizations to achieve 12,500x performance gains.

agentdb-learning-plugins

from DNYoussef/context-cascade

Create AI learning plugins using AgentDBs 9 reinforcement learning algorithms. Train Decision Transformer, Q-Learning, SARSA, and Actor-Critic models. Deploy these plugins to build self-learning agents, implement RL workflows, and optimize agent behavior through experience. Apply offline RL for safe learning from logged data.

agentdb-advanced-features

from DNYoussef/context-cascade

Master advanced AgentDB features including QUIC synchronization, multi-database management, custom distance metrics, hybrid search, and distributed systems integration. Use when building distributed AI systems, multi-agent coordination, or advanced vector search applications.

/============================================================================/

from DNYoussef/context-cascade

/* SKILL SKILL :: VERILINGUA x VERIX EDITION */

web-scraping

from DNYoussef/context-cascade

Structured data extraction from web pages using claude-in-chrome MCP with sequential-thinking planning. Focus on READ operations, data transformation, and pagination handling for multi-page extraction.

visual-testing

from DNYoussef/context-cascade

Screenshot-based visual comparison and regression testing using claude-in-chrome MCP. Captures, compares, and validates UI states to detect layout shifts, visual bugs, and design regressions across viewports.

agentdb-vector-search-optimization

Best use case

When to use this skill

When not to use this skill

Installation

How agentdb-vector-search-optimization Compares

Frequently Asked Questions

What does this skill do?

Where can I find the source code?

SKILL.md Source

Related Skills

reasoningbank-with-agentdb

reasoningbank-adaptive-learning-with-agentdb

gemini-research

agentdb-semantic-vector-search

agentdb-reinforcement-learning-training

agentdb-persistent-memory-patterns

agentdb-performance-optimization

agentdb-learning-plugins

agentdb-advanced-features

/*============================================================================*/

web-scraping

visual-testing

/============================================================================/