rag-engineer
Expert in building Retrieval-Augmented Generation systems. Masters embedding models, vector databases, chunking strategies, and retrieval optimization for LLM applications. Use when: building RAG, vector search, embeddings, semantic search, document retrieval.
Best use case
rag-engineer is best used when you need a repeatable AI agent workflow instead of a one-off prompt. It is especially useful for teams working in multi. Expert in building Retrieval-Augmented Generation systems. Masters embedding models, vector databases, chunking strategies, and retrieval optimization for LLM applications. Use when: building RAG, vector search, embeddings, semantic search, document retrieval.
Expert in building Retrieval-Augmented Generation systems. Masters embedding models, vector databases, chunking strategies, and retrieval optimization for LLM applications. Use when: building RAG, vector search, embeddings, semantic search, document retrieval.
Users should expect a more consistent workflow output, faster repeated execution, and less time spent rewriting prompts from scratch.
Practical example
Example input
Use the "rag-engineer" skill to help with this workflow task. Context: Expert in building Retrieval-Augmented Generation systems. Masters embedding models, vector databases, chunking strategies, and retrieval optimization for LLM applications. Use when: building RAG, vector search, embeddings, semantic search, document retrieval.
Example output
A structured workflow result with clearer steps, more consistent formatting, and an output that is easier to reuse in the next run.
When to use this skill
- Use this skill when you want a reusable workflow rather than writing the same prompt again and again.
When not to use this skill
- Do not use this when you only need a one-off answer and do not need a reusable workflow.
- Do not use it if you cannot install or maintain the related files, repository context, or supporting tools.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in
.claude/skills/rag-engineer/SKILL.mdinside your project - Restart your AI agent — it will auto-discover the skill
How rag-engineer Compares
| Feature / Agent | rag-engineer | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Expert in building Retrieval-Augmented Generation systems. Masters embedding models, vector databases, chunking strategies, and retrieval optimization for LLM applications. Use when: building RAG, vector search, embeddings, semantic search, document retrieval.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# RAG Engineer **Role**: RAG Systems Architect I bridge the gap between raw documents and LLM understanding. I know that retrieval quality determines generation quality - garbage in, garbage out. I obsess over chunking boundaries, embedding dimensions, and similarity metrics because they make the difference between helpful and hallucinating. ## Capabilities - Vector embeddings and similarity search - Document chunking and preprocessing - Retrieval pipeline design - Semantic search implementation - Context window optimization - Hybrid search (keyword + semantic) ## Requirements - LLM fundamentals - Understanding of embeddings - Basic NLP concepts ## Patterns ### Semantic Chunking Chunk by meaning, not arbitrary token counts ```javascript - Use sentence boundaries, not token limits - Detect topic shifts with embedding similarity - Preserve document structure (headers, paragraphs) - Include overlap for context continuity - Add metadata for filtering ``` ### Hierarchical Retrieval Multi-level retrieval for better precision ```javascript - Index at multiple chunk sizes (paragraph, section, document) - First pass: coarse retrieval for candidates - Second pass: fine-grained retrieval for precision - Use parent-child relationships for context ``` ### Hybrid Search Combine semantic and keyword search ```javascript - BM25/TF-IDF for keyword matching - Vector similarity for semantic matching - Reciprocal Rank Fusion for combining scores - Weight tuning based on query type ``` ## Anti-Patterns ### ❌ Fixed Chunk Size ### ❌ Embedding Everything ### ❌ Ignoring Evaluation ## ⚠️ Sharp Edges | Issue | Severity | Solution | |-------|----------|----------| | Fixed-size chunking breaks sentences and context | high | Use semantic chunking that respects document structure: | | Pure semantic search without metadata pre-filtering | medium | Implement hybrid filtering: | | Using same embedding model for different content types | medium | Evaluate embeddings per content type: | | Using first-stage retrieval results directly | medium | Add reranking step: | | Cramming maximum context into LLM prompt | medium | Use relevance thresholds: | | Not measuring retrieval quality separately from generation | high | Separate retrieval evaluation: | | Not updating embeddings when source documents change | medium | Implement embedding refresh: | | Same retrieval strategy for all query types | medium | Implement hybrid search: | ## Related Skills Works well with: `ai-agents-architect`, `prompt-engineer`, `database-architect`, `backend`
Related Skills
vector-database-engineer
Expert in vector databases, embedding strategies, and semantic search implementation. Masters Pinecone, Weaviate, Qdrant, Milvus, and pgvector for RAG applications, recommendation systems, and similar
tutorial-engineer
Creates step-by-step tutorials and educational content from code. Transforms complex concepts into progressive learning experiences with hands-on examples. Use PROACTIVELY for onboarding guides, feature tutorials, or concept explanations.
reverse-engineer
Expert reverse engineer specializing in binary analysis, disassembly, decompilation, and software analysis. Masters IDA Pro, Ghidra, radare2, x64dbg, and modern RE toolchains. Handles executable analysis, library inspection, protocol extraction, and vulnerability research. Use PROACTIVELY for binary analysis, CTF challenges, security research, or understanding undocumented software.
research-engineer
An uncompromising Academic Research Engineer. Operates with absolute scientific rigor, objective criticism, and zero flair. Focuses on theoretical correctness, formal verification, and optimal implementation across any required technology.
protocol-reverse-engineering
Master network protocol reverse engineering including packet analysis, protocol dissection, and custom protocol documentation. Use when analyzing network traffic, understanding proprietary protocols, or debugging network communication.
prompt-engineering-patterns
Master advanced prompt engineering techniques to maximize LLM performance, reliability, and controllability in production. Use when optimizing prompts, improving LLM outputs, or designing production prompt templates.
prompt-engineer
Transforms user prompts into optimized prompts using frameworks (RTF, RISEN, Chain of Thought, RODES, Chain of Density, RACE, RISE, STAR, SOAP, CLEAR, GROW)
performance-engineer
Expert performance engineer specializing in modern observability, application optimization, and scalable system performance. Masters OpenTelemetry, distributed tracing, load testing, multi-tier caching, Core Web Vitals, and performance monitoring. Handles end-to-end optimization, real user monitoring, and scalability patterns. Use PROACTIVELY for performance optimization, observability, or scalability challenges.
observability-engineer
Build production-ready monitoring, logging, and tracing systems. Implements comprehensive observability strategies, SLI/SLO management, and incident response workflows. Use PROACTIVELY for monitoring infrastructure, performance optimization, or production reliability.
network-engineer
Expert network engineer specializing in modern cloud networking, security architectures, and performance optimization. Masters multi-cloud connectivity, service mesh, zero-trust networking, SSL/TLS, global load balancing, and advanced troubleshooting. Handles CDN optimization, network automation, and compliance. Use PROACTIVELY for network design, connectivity issues, or performance optimization.
mlops-engineer
Build comprehensive ML pipelines, experiment tracking, and model registries with MLflow, Kubeflow, and modern MLOps tools. Implements automated training, deployment, and monitoring across cloud platforms. Use PROACTIVELY for ML infrastructure, experiment management, or pipeline automation.
ml-engineer
Build production ML systems with PyTorch 2.x, TensorFlow, and modern ML frameworks. Implements model serving, feature engineering, A/B testing, and monitoring. Use PROACTIVELY for ML model deployment, inference optimization, or production ML infrastructure.