rag-implementation

Retrieval-Augmented Generation patterns including chunking, embeddings, vector stores, and retrieval optimization Use when: rag, retrieval augmented, vector search, embeddings, semantic search.

16 stars

bydiegosouzapw

View on GitHub Installation ↓

Best use case

rag-implementation is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Retrieval-Augmented Generation patterns including chunking, embeddings, vector stores, and retrieval optimization Use when: rag, retrieval augmented, vector search, embeddings, semantic search.

Teams using rag-implementation should expect a more consistent output, faster repeated execution, less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$curl -o ~/.claude/skills/rag-implementation/SKILL.md --create-dirs "https://raw.githubusercontent.com/diegosouzapw/awesome-omni-skill/main/skills/ai-agents/rag-implementation/SKILL.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/rag-implementation/SKILL.md inside your project
Restart your AI agent — it will auto-discover the skill

How rag-implementation Compares

Feature / Agent	rag-implementation	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

What does this skill do?

Retrieval-Augmented Generation patterns including chunking, embeddings, vector stores, and retrieval optimization Use when: rag, retrieval augmented, vector search, embeddings, semantic search.

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# RAG Implementation

You're a RAG specialist who has built systems serving millions of queries over
terabytes of documents. You've seen the naive "chunk and embed" approach fail,
and developed sophisticated chunking, retrieval, and reranking strategies.

You understand that RAG is not just vector search—it's about getting the right
information to the LLM at the right time. You know when RAG helps and when
it's unnecessary overhead.

Your core principles:
1. Chunking is critical—bad chunks mean bad retrieval
2. Hybri

## Capabilities

- document-chunking
- embedding-models
- vector-stores
- retrieval-strategies
- hybrid-search
- reranking

## Patterns

### Semantic Chunking

Chunk by meaning, not arbitrary size

### Hybrid Search

Combine dense (vector) and sparse (keyword) search

### Contextual Reranking

Rerank retrieved docs with LLM for relevance

## Anti-Patterns

### ❌ Fixed-Size Chunking

### ❌ No Overlap

### ❌ Single Retrieval Strategy

## ⚠️ Sharp Edges

| Issue | Severity | Solution |
|-------|----------|----------|
| Poor chunking ruins retrieval quality | critical | // Use recursive character text splitter with overlap |
| Query and document embeddings from different models | critical | // Ensure consistent embedding model usage |
| RAG adds significant latency to responses | high | // Optimize RAG latency |
| Documents updated but embeddings not refreshed | medium | // Maintain sync between documents and embeddings |

## Related Skills

Works well with: `context-window-management`, `conversation-memory`, `prompt-caching`, `data-pipeline`

Related Skills

cqrs-implementation

from diegosouzapw/awesome-omni-skill

Implement Command Query Responsibility Segregation for scalable architectures. Use when separating read and write models, optimizing query performance, or building event-sourced systems.

agent-ops-implementation

from diegosouzapw/awesome-omni-skill

Implement only after a validated/approved plan. Use for coding: small diffs, frequent tests, no refactors, stop on ambiguity.

linear-iterate-on-implementation

from diegosouzapw/awesome-omni-skill

Iteratively refine a feature implementation by identifying and fixing bugs, edge cases, and improvements

slo-implementation

from diegosouzapw/awesome-omni-skill

Define and implement Service Level Indicators (SLIs) and Service Level Objectives (SLOs) with error budgets and alerting. Use when establishing reliability targets, implementing SRE practices, or m...

service-mesh-implementation

from diegosouzapw/awesome-omni-skill

Implement service mesh (Istio, Linkerd) for service-to-service communication, traffic management, security, and observability.

Product Analytics Implementation

from diegosouzapw/awesome-omni-skill

Product Analytics Implementation enables systematic tracking, measurement, and analysis of product usage data to drive data-driven product decisions. This capability is essential for understanding use

deepagents-implementation

from diegosouzapw/awesome-omni-skill

Implements agents using Deep Agents. Use when building agents with create_deep_agent, configuring backends, defining subagents, adding middleware, or setting up human-in-the-loop workflows.

ai-agent-implementation

from diegosouzapw/awesome-omni-skill

Step-by-step checklist and best practices for implementing new AI agent tools in the omer-akben portfolio. Use when creating new agent tools, API routes, or extending agent capabilities.

advanced-agentdb-vector-search-implementation

from diegosouzapw/awesome-omni-skill

Advanced AgentDB Vector Search Implementation operates on 3 fundamental principles:

bgo

from diegosouzapw/awesome-omni-skill

Automates the complete Blender build-go workflow, from building and packaging your extension/add-on to removing old versions, installing, enabling, and launching Blender for quick testing and iteration.

Coding & Development

moai-lang-r

from diegosouzapw/awesome-omni-skill

R 4.4+ best practices with testthat 3.2, lintr 3.2, and data analysis patterns.

moai-lang-python

from diegosouzapw/awesome-omni-skill

Python 3.13+ development specialist covering FastAPI, Django, async patterns, data science, testing with pytest, and modern Python features. Use when developing Python APIs, web applications, data pipelines, or writing tests.