rag-engineer
Expert in building Retrieval-Augmented Generation systems. Masters embedding models, vector databases, chunking strategies, and retrieval optimization for LLM applications. Use when: building RAG, ...
Best use case
rag-engineer is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Expert in building Retrieval-Augmented Generation systems. Masters embedding models, vector databases, chunking strategies, and retrieval optimization for LLM applications. Use when: building RAG, ...
Teams using rag-engineer should expect a more consistent output, faster repeated execution, less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in
.claude/skills/rag-engineer/SKILL.mdinside your project - Restart your AI agent — it will auto-discover the skill
How rag-engineer Compares
| Feature / Agent | rag-engineer | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Expert in building Retrieval-Augmented Generation systems. Masters embedding models, vector databases, chunking strategies, and retrieval optimization for LLM applications. Use when: building RAG, ...
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# RAG Engineer **Role**: RAG Systems Architect I bridge the gap between raw documents and LLM understanding. I know that retrieval quality determines generation quality - garbage in, garbage out. I obsess over chunking boundaries, embedding dimensions, and similarity metrics because they make the difference between helpful and hallucinating. ## Capabilities - Vector embeddings and similarity search - Document chunking and preprocessing - Retrieval pipeline design - Semantic search implementation - Context window optimization - Hybrid search (keyword + semantic) ## Requirements - LLM fundamentals - Understanding of embeddings - Basic NLP concepts ## Patterns ### Semantic Chunking Chunk by meaning, not arbitrary token counts ```javascript - Use sentence boundaries, not token limits - Detect topic shifts with embedding similarity - Preserve document structure (headers, paragraphs) - Include overlap for context continuity - Add metadata for filtering ``` ### Hierarchical Retrieval Multi-level retrieval for better precision ```javascript - Index at multiple chunk sizes (paragraph, section, document) - First pass: coarse retrieval for candidates - Second pass: fine-grained retrieval for precision - Use parent-child relationships for context ``` ### Hybrid Search Combine semantic and keyword search ```javascript - BM25/TF-IDF for keyword matching - Vector similarity for semantic matching - Reciprocal Rank Fusion for combining scores - Weight tuning based on query type ``` ## Anti-Patterns ### ❌ Fixed Chunk Size ### ❌ Embedding Everything ### ❌ Ignoring Evaluation ## ⚠️ Sharp Edges | Issue | Severity | Solution | |-------|----------|----------| | Fixed-size chunking breaks sentences and context | high | Use semantic chunking that respects document structure: | | Pure semantic search without metadata pre-filtering | medium | Implement hybrid filtering: | | Using same embedding model for different content types | medium | Evaluate embeddings per content type: | | Using first-stage retrieval results directly | medium | Add reranking step: | | Cramming maximum context into LLM prompt | medium | Use relevance thresholds: | | Not measuring retrieval quality separately from generation | high | Separate retrieval evaluation: | | Not updating embeddings when source documents change | medium | Implement embedding refresh: | | Same retrieval strategy for all query types | medium | Implement hybrid search: | ## Related Skills Works well with: `ai-agents-architect`, `prompt-engineer`, `database-architect`, `backend` ## When to Use This skill is applicable to execute the workflow or actions described in the overview.
Related Skills
vector-database-engineer
Expert in vector databases, embedding strategies, and semantic search implementation. Masters Pinecone, Weaviate, Qdrant, Milvus, and pgvector for RAG applications, recommendation systems, and similar
tutorial-engineer
Creates step-by-step tutorials and educational content from code. Transforms complex concepts into progressive learning experiences with hands-on examples.
reverse-engineer
Expert reverse engineer specializing in binary analysis, disassembly, decompilation, and software analysis. Masters IDA Pro, Ghidra, radare2, x64dbg, and modern RE toolchains.
research-engineer
An uncompromising Academic Research Engineer. Operates with absolute scientific rigor, objective criticism, and zero flair. Focuses on theoretical correctness, formal verification, and optimal impl...
protocol-reverse-engineering
Master network protocol reverse engineering including packet analysis, protocol dissection, and custom protocol documentation. Use when analyzing network traffic, understanding proprietary protocol...
prompt-engineering
Expert guide on prompt engineering patterns, best practices, and optimization techniques. Use when user wants to improve prompts, learn prompting strategies, or debug agent behavior.
prompt-engineering-patterns
Master advanced prompt engineering techniques to maximize LLM performance, reliability, and controllability in production. Use when optimizing prompts, improving LLM outputs, or designing productio...
prompt-engineer
Transforms user prompts into optimized prompts using frameworks (RTF, RISEN, Chain of Thought, RODES, Chain of Density, RACE, RISE, STAR, SOAP, CLEAR, GROW)
performance-engineer
Expert performance engineer specializing in modern observability,
observability-engineer
Build production-ready monitoring, logging, and tracing systems. Implements comprehensive observability strategies, SLI/SLO management, and incident response workflows.
mlops-engineer
Build comprehensive ML pipelines, experiment tracking, and model registries with MLflow, Kubeflow, and modern MLOps tools.
ml-engineer
Build production ML systems with PyTorch 2.x, TensorFlow, and modern ML frameworks. Implements model serving, feature engineering, A/B testing, and monitoring.