vector-database-engineer

Expert in vector databases, embedding strategies, and semantic search implementation. Masters Pinecone, Weaviate, Qdrant, Milvus, and pgvector for RAG applications, recommendation systems, and similar

16 stars

Best use case

vector-database-engineer is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Expert in vector databases, embedding strategies, and semantic search implementation. Masters Pinecone, Weaviate, Qdrant, Milvus, and pgvector for RAG applications, recommendation systems, and similar

Teams using vector-database-engineer should expect a more consistent output, faster repeated execution, less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$curl -o ~/.claude/skills/vector-database-engineer/SKILL.md --create-dirs "https://raw.githubusercontent.com/diegosouzapw/awesome-omni-skill/main/skills/ai-agents/vector-database-engineer/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/vector-database-engineer/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How vector-database-engineer Compares

Feature / Agentvector-database-engineerStandard Approach
Platform SupportNot specifiedLimited / Varies
Context Awareness High Baseline
Installation ComplexityUnknownN/A

Frequently Asked Questions

What does this skill do?

Expert in vector databases, embedding strategies, and semantic search implementation. Masters Pinecone, Weaviate, Qdrant, Milvus, and pgvector for RAG applications, recommendation systems, and similar

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# Vector Database Engineer

Expert in vector databases, embedding strategies, and semantic search implementation. Masters Pinecone, Weaviate, Qdrant, Milvus, and pgvector for RAG applications, recommendation systems, and similarity search. Use PROACTIVELY for vector search implementation, embedding optimization, or semantic retrieval systems.

## Do not use this skill when

- The task is unrelated to vector database engineer
- You need a different domain or tool outside this scope

## Instructions

- Clarify goals, constraints, and required inputs.
- Apply relevant best practices and validate outcomes.
- Provide actionable steps and verification.
- If detailed examples are required, open `resources/implementation-playbook.md`.

## Capabilities

- Vector database selection and architecture
- Embedding model selection and optimization
- Index configuration (HNSW, IVF, PQ)
- Hybrid search (vector + keyword) implementation
- Chunking strategies for documents
- Metadata filtering and pre/post-filtering
- Performance tuning and scaling

## Use this skill when

- Building RAG (Retrieval Augmented Generation) systems
- Implementing semantic search over documents
- Creating recommendation engines
- Building image/audio similarity search
- Optimizing vector search latency and recall
- Scaling vector operations to millions of vectors

## Workflow

1. Analyze data characteristics and query patterns
2. Select appropriate embedding model
3. Design chunking and preprocessing pipeline
4. Choose vector database and index type
5. Configure metadata schema for filtering
6. Implement hybrid search if needed
7. Optimize for latency/recall tradeoffs
8. Set up monitoring and reindexing strategies

## Best Practices

- Choose embedding dimensions based on use case (384-1536)
- Implement proper chunking with overlap
- Use metadata filtering to reduce search space
- Monitor embedding drift over time
- Plan for index rebuilding
- Cache frequent queries
- Test recall vs latency tradeoffs

Related Skills

llm-application-dev-ai-engineer

16
from diegosouzapw/awesome-omni-skill

Build production-ready LLM applications, advanced RAG systems, and intelligent agents. Implements vector search, multimodal AI, agent orchestration, and enterprise AI integrations. Use PROACTIVELY for LLM features, chatbots, AI agents, or AI-powered applications. Use when: the task directly matches ai engineer responsibilities within plugin llm-application-dev. Do not use when: a more specific framework or task-focused skill is clearly a better match.

flow-engineer-rule

16
from diegosouzapw/awesome-omni-skill

Guide for creating persistent AI rules (coding standards, project conventions, file-specific patterns). Use when users want to create a rule, add coding standards, set up project conventions, configure file-specific patterns, or ask about rules placement. Works across IDEs (Cursor, Claude Code, Antigravity, OpenAI Codex, OpenCode).

docker-database

16
from diegosouzapw/awesome-omni-skill

Configure database containers with security, persistence, and health checks

Database Sync

16
from diegosouzapw/awesome-omni-skill

Automate database synchronization, replication, migration, and cross-platform data integration

database-skill

16
from diegosouzapw/awesome-omni-skill

Design and manage relational databases including table creation, migrations, and schema design. Use for database modeling and maintenance.

database-architect

16
from diegosouzapw/awesome-omni-skill

Database design and optimization specialist. Schema design, query optimization, indexing strategies, data modeling, and migration planning for relational and NoSQL databases.

data-engineering-data-pipeline

16
from diegosouzapw/awesome-omni-skill

You are a data pipeline architecture expert specializing in scalable, reliable, and cost-effective data pipelines for batch and streaming data processing.

data-engineer

16
from diegosouzapw/awesome-omni-skill

Build scalable data pipelines, modern data warehouses, and real-time streaming architectures. Implements Apache Spark, dbt, Airflow, and cloud-native data platforms. Use PROACTIVELY for data pipeline design, analytics infrastructure, or modern data stack implementation.

context-engineering

16
from diegosouzapw/awesome-omni-skill

Use when designing agent system prompts, optimizing RAG retrieval, or when context is too expensive or slow. Reduces tokens while maintaining quality through strategic positioning and attention-aware design.

Build Your Data Engineering Skill

16
from diegosouzapw/awesome-omni-skill

Create your LLMOps data engineering skill in one prompt, then learn to improve it throughout the chapter

arch-database

16
from diegosouzapw/awesome-omni-skill

DB architecture: relational vs document vs graph vs vector, schema design, indexing, replication, sharding

ai-engineering-skill

16
from diegosouzapw/awesome-omni-skill

Practical guide for building production ML systems based on Chip Huyen's AI Engineering book. Use when users ask about model evaluation, deployment strategies, monitoring, data pipelines, feature engineering, cost optimization, or MLOps. Covers metrics, A/B testing, serving patterns, drift detection, and production best practices.