rag-implementation

Retrieval-Augmented Generation patterns including chunking, embeddings, vector stores, and retrieval optimization Use when: rag, retrieval augmented, vector search, embeddings, semantic search.

24,269 stars

Best use case

rag-implementation is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Retrieval-Augmented Generation patterns including chunking, embeddings, vector stores, and retrieval optimization Use when: rag, retrieval augmented, vector search, embeddings, semantic search.

Teams using rag-implementation should expect a more consistent output, faster repeated execution, less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$curl -o ~/.claude/skills/rag-implementation/SKILL.md --create-dirs "https://raw.githubusercontent.com/davila7/claude-code-templates/main/cli-tool/components/skills/ai-research/rag-implementation/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/rag-implementation/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How rag-implementation Compares

Feature / Agentrag-implementationStandard Approach
Platform SupportNot specifiedLimited / Varies
Context Awareness High Baseline
Installation ComplexityUnknownN/A

Frequently Asked Questions

What does this skill do?

Retrieval-Augmented Generation patterns including chunking, embeddings, vector stores, and retrieval optimization Use when: rag, retrieval augmented, vector search, embeddings, semantic search.

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

Related Guides

SKILL.md Source

# RAG Implementation

You're a RAG specialist who has built systems serving millions of queries over
terabytes of documents. You've seen the naive "chunk and embed" approach fail,
and developed sophisticated chunking, retrieval, and reranking strategies.

You understand that RAG is not just vector search—it's about getting the right
information to the LLM at the right time. You know when RAG helps and when
it's unnecessary overhead.

Your core principles:
1. Chunking is critical—bad chunks mean bad retrieval
2. Hybri

## Capabilities

- document-chunking
- embedding-models
- vector-stores
- retrieval-strategies
- hybrid-search
- reranking

## Patterns

### Semantic Chunking

Chunk by meaning, not arbitrary size

### Hybrid Search

Combine dense (vector) and sparse (keyword) search

### Contextual Reranking

Rerank retrieved docs with LLM for relevance

## Anti-Patterns

### ❌ Fixed-Size Chunking

### ❌ No Overlap

### ❌ Single Retrieval Strategy

## ⚠️ Sharp Edges

| Issue | Severity | Solution |
|-------|----------|----------|
| Poor chunking ruins retrieval quality | critical | // Use recursive character text splitter with overlap |
| Query and document embeddings from different models | critical | // Ensure consistent embedding model usage |
| RAG adds significant latency to responses | high | // Optimize RAG latency |
| Documents updated but embeddings not refreshed | medium | // Maintain sync between documents and embeddings |

## Related Skills

Works well with: `context-window-management`, `conversation-memory`, `prompt-caching`, `data-pipeline`

Related Skills

notion-spec-to-implementation

24269
from davila7/claude-code-templates

Turn Notion specs into implementation plans, tasks, and progress tracking; use when implementing PRDs/feature specs and creating Notion plans + tasks from them.

async-python-patterns

24269
from davila7/claude-code-templates

Comprehensive guidance for implementing asynchronous Python applications using asyncio, concurrent programming patterns, and async/await for building high-performance, non-blocking systems.

slack-automation

24269
from davila7/claude-code-templates

Automate Slack workspace operations including messaging, search, channel management, and reaction workflows through Composio's Slack toolkit.

linear-automation

24269
from davila7/claude-code-templates

Automate Linear tasks via Rube MCP (Composio): issues, projects, cycles, teams, labels. Always search tools first for current schemas.

jira-automation

24269
from davila7/claude-code-templates

Automate Jira tasks via Rube MCP (Composio): issues, projects, sprints, boards, comments, users. Always search tools first for current schemas.

gitops-workflow

24269
from davila7/claude-code-templates

Complete guide to implementing GitOps workflows with ArgoCD and Flux for automated Kubernetes deployments.

github-automation

24269
from davila7/claude-code-templates

Automate GitHub repositories, issues, pull requests, branches, CI/CD, and permissions via Rube MCP (Composio). Manage code workflows, review PRs, search code, and handle deployments programmatically.

github-actions-templates

24269
from davila7/claude-code-templates

Production-ready GitHub Actions workflow patterns for testing, building, and deploying applications.

zustand-store-ts

24269
from davila7/claude-code-templates

Create Zustand stores following established patterns with proper TypeScript types and middleware.

zod-validation-expert

24269
from davila7/claude-code-templates

Expert in Zod — TypeScript-first schema validation. Covers parsing, custom errors, refinements, type inference, and integration with React Hook Form, Next.js, and tRPC.

tanstack-query-expert

24269
from davila7/claude-code-templates

Expert in TanStack Query (React Query) — asynchronous state management. Covers data fetching, stale time configuration, mutations, optimistic updates, and Next.js App Router (SSR) integration.

tailwind-design-system

24269
from davila7/claude-code-templates

Build production-ready design systems with Tailwind CSS, including design tokens, component variants, responsive patterns, and accessibility.