rag-hybrid-search

Hybrid search combining semantic and keyword retrieval for RAG pipelines. Implement BM25 + dense vector search with fusion strategies.

509 stars

bya5c-ai

View on GitHub Installation ↓

Best use case

rag-hybrid-search is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Hybrid search combining semantic and keyword retrieval for RAG pipelines. Implement BM25 + dense vector search with fusion strategies.

Teams using rag-hybrid-search should expect a more consistent output, faster repeated execution, less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$curl -o ~/.claude/skills/rag-hybrid-search/SKILL.md --create-dirs "https://raw.githubusercontent.com/a5c-ai/babysitter/main/library/specializations/ai-agents-conversational/skills/rag-hybrid-search/SKILL.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/rag-hybrid-search/SKILL.md inside your project
Restart your AI agent — it will auto-discover the skill

How rag-hybrid-search Compares

Feature / Agent	rag-hybrid-search	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

What does this skill do?

Hybrid search combining semantic and keyword retrieval for RAG pipelines. Implement BM25 + dense vector search with fusion strategies.

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# rag-hybrid-search

Implement hybrid search combining semantic vector retrieval with keyword-based BM25 search for improved RAG pipeline accuracy and recall.

## Overview

Hybrid search addresses the limitations of pure semantic or pure keyword search:
- Semantic search excels at conceptual similarity but may miss exact matches
- Keyword search finds exact terms but lacks semantic understanding
- Hybrid combines both for superior retrieval performance

## Capabilities

### Search Strategies
- Dense vector semantic search (embeddings)
- Sparse vector keyword search (BM25, TF-IDF)
- Hybrid fusion with configurable weighting
- Reciprocal Rank Fusion (RRF) combination

### Retrieval Configuration
- Configure embedding models for dense search
- Tune BM25 parameters (k1, b values)
- Set retrieval limits and thresholds
- Apply metadata filtering

### Ranking & Reranking
- Score normalization across search types
- Weighted score fusion
- Cross-encoder reranking
- MMR (Maximum Marginal Relevance) diversity

### Index Management
- Create and update hybrid indexes
- Batch indexing with progress tracking
- Index optimization and maintenance
- Multi-index federation

## Usage

### Basic Hybrid Search with LangChain

```python
from langchain_community.retrievers import BM25Retriever
from langchain_community.vectorstores import Chroma
from langchain.retrievers import EnsembleRetriever
from langchain_openai import OpenAIEmbeddings

# Create documents
docs = [...]  # Your document chunks

# Dense retriever (semantic)
embeddings = OpenAIEmbeddings()
vectorstore = Chroma.from_documents(docs, embeddings)
dense_retriever = vectorstore.as_retriever(search_kwargs={"k": 5})

# Sparse retriever (BM25)
bm25_retriever = BM25Retriever.from_documents(docs)
bm25_retriever.k = 5

# Hybrid ensemble
hybrid_retriever = EnsembleRetriever(
    retrievers=[bm25_retriever, dense_retriever],
    weights=[0.4, 0.6]  # Adjust based on use case
)

# Query
results = hybrid_retriever.invoke("How do I configure the system?")
```

### Reciprocal Rank Fusion

```python
def reciprocal_rank_fusion(results_lists: list, k: int = 60) -> list:
    """
    Combine multiple ranked lists using RRF.
    k is a constant (typically 60) for smoothing.
    """
    fused_scores = {}

    for results in results_lists:
        for rank, doc in enumerate(results):
            doc_id = doc.metadata.get("id", str(doc.page_content[:50]))
            if doc_id not in fused_scores:
                fused_scores[doc_id] = {"doc": doc, "score": 0}
            fused_scores[doc_id]["score"] += 1 / (k + rank + 1)

    # Sort by fused score
    sorted_docs = sorted(
        fused_scores.values(),
        key=lambda x: x["score"],
        reverse=True
    )

    return [item["doc"] for item in sorted_docs]

# Use with multiple retrievers
semantic_results = dense_retriever.invoke(query)
keyword_results = bm25_retriever.invoke(query)
hybrid_results = reciprocal_rank_fusion([semantic_results, keyword_results])
```

### Pinecone Hybrid Search

```python
from pinecone import Pinecone
from pinecone_text.sparse import BM25Encoder

# Initialize Pinecone
pc = Pinecone(api_key="your-api-key")
index = pc.Index("hybrid-index")

# Prepare sparse encoder
bm25 = BM25Encoder()
bm25.fit(corpus)  # Fit on your document corpus

def hybrid_query(query: str, alpha: float = 0.5, top_k: int = 10):
    """
    Query with hybrid search.
    alpha: weight for dense vectors (1-alpha for sparse)
    """
    # Get dense embedding
    dense_embedding = embeddings.embed_query(query)

    # Get sparse embedding
    sparse_embedding = bm25.encode_queries([query])[0]

    # Hybrid query
    results = index.query(
        vector=dense_embedding,
        sparse_vector=sparse_embedding,
        top_k=top_k,
        include_metadata=True
    )

    return results
```

### Weaviate Hybrid Search

```python
import weaviate

client = weaviate.Client("http://localhost:8080")

def weaviate_hybrid_search(query: str, alpha: float = 0.5, limit: int = 10):
    """
    Weaviate native hybrid search.
    alpha: 0 = pure BM25, 1 = pure vector
    """
    result = (
        client.query
        .get("Document", ["content", "title", "metadata"])
        .with_hybrid(
            query=query,
            alpha=alpha,
            properties=["content", "title"]
        )
        .with_limit(limit)
        .do()
    )

    return result["data"]["Get"]["Document"]
```

## Task Definition

```javascript
const ragHybridSearchTask = defineTask({
  name: 'rag-hybrid-search-setup',
  description: 'Configure hybrid search for RAG pipeline',

  inputs: {
    vectorStore: { type: 'string', required: true },  // 'pinecone', 'weaviate', 'chroma', etc.
    embeddingModel: { type: 'string', default: 'text-embedding-3-small' },
    bm25Params: { type: 'object', default: { k1: 1.5, b: 0.75 } },
    fusionStrategy: { type: 'string', default: 'rrf' },  // 'rrf', 'weighted', 'custom'
    denseWeight: { type: 'number', default: 0.6 },
    topK: { type: 'number', default: 10 }
  },

  outputs: {
    retrieverConfigured: { type: 'boolean' },
    indexStats: { type: 'object' },
    artifacts: { type: 'array' }
  },

  async run(inputs, taskCtx) {
    return {
      kind: 'skill',
      title: `Configure hybrid search with ${inputs.vectorStore}`,
      skill: {
        name: 'rag-hybrid-search',
        context: {
          vectorStore: inputs.vectorStore,
          embeddingModel: inputs.embeddingModel,
          bm25Params: inputs.bm25Params,
          fusionStrategy: inputs.fusionStrategy,
          denseWeight: inputs.denseWeight,
          topK: inputs.topK,
          instructions: [
            'Validate vector store connection and configuration',
            'Set up dense embedding pipeline',
            'Configure BM25/sparse encoding',
            'Implement fusion strategy',
            'Test retrieval quality with sample queries',
            'Document configuration and tuning parameters'
          ]
        }
      },
      io: {
        inputJsonPath: `tasks/${taskCtx.effectId}/input.json`,
        outputJsonPath: `tasks/${taskCtx.effectId}/result.json`
      }
    };
  }
});
```

## Applicable Processes

- rag-pipeline-implementation
- advanced-rag-patterns
- knowledge-base-qa
- vector-database-setup

## External Dependencies

- Vector database (Pinecone, Weaviate, Chroma, Milvus, Qdrant)
- Embedding provider (OpenAI, Cohere, Hugging Face)
- BM25 encoder (rank_bm25, pinecone-text)

## References

- [Claude Context (Zilliz)](https://github.com/zilliztech/claude-context) - Hybrid search MCP
- [MCP Local RAG](https://github.com/shinpr/mcp-local-rag) - Local-first RAG with hybrid search
- [LangChain Anthropic MCP Server](https://glama.ai/mcp/servers/@spencer-life/langchain-anthropic-mcp-server)
- [Pinecone Hybrid Search](https://docs.pinecone.io/docs/hybrid-search)
- [Weaviate Hybrid Search](https://weaviate.io/developers/weaviate/search/hybrid)

## Related Skills

- SK-RAG-001 rag-chunking-strategy
- SK-RAG-004 rag-reranking
- SK-RAG-005 rag-query-transformation
- SK-VDB-001 through SK-VDB-005 (vector database integrations)

## Related Agents

- AG-RAG-001 rag-pipeline-architect
- AG-RAG-003 vector-db-specialist
- AG-RAG-004 retrieval-optimizer

Related Skills

user-research-synthesis

509

from a5c-ai/babysitter

Specialized skill for synthesizing qualitative user research into actionable insights. Analyzes interview transcripts, extracts patterns and themes, identifies pain points, creates affinity diagrams, and generates persona attributes from research data.

specialization-researcher

509

from a5c-ai/babysitter

Research specialization domains, compile references, analyze best practices, and gather comprehensive knowledge for new specialization creation.

research-ethics-irb

509

from a5c-ai/babysitter

Navigate institutional review board processes, informed consent, confidentiality, and ethical considerations in human subjects research

ethnographic-research

509

from a5c-ai/babysitter

Conduct participant observation, fieldwork, immersion, and thick description documentation in diverse cultural settings

research-ethics-irb-navigation

509

from a5c-ai/babysitter

Prepare ethics applications, develop informed consent protocols, and navigate institutional review processes for human subjects research

curatorial-research

509

from a5c-ai/babysitter

Conduct art historical research, provenance investigation, and scholarly analysis to inform exhibitions, acquisitions, and publications using primary and secondary sources

semantic-scholar-search

509

from a5c-ai/babysitter

Academic literature search using Semantic Scholar API for citation-aware paper discovery

elicit-research-assistant

509

from a5c-ai/babysitter

AI-assisted literature review for question-answering over papers and evidence synthesis

pennylane-hybrid-executor

509

from a5c-ai/babysitter

PennyLane integration skill for hybrid quantum-classical machine learning and variational algorithms

arxiv-search-interface

509

from a5c-ai/babysitter

Search and analyze mathematical literature on arXiv

blast-sequence-search

509

from a5c-ai/babysitter

BLAST skill for sequence similarity searching, homology detection, and database querying

market-research-platform

509

from a5c-ai/babysitter

Integration with market research platforms and survey tools for primary and secondary research