ai-engineer

Build production-ready LLM applications, advanced RAG systems, and intelligent agents. Implements vector search, multimodal AI, agent orchestration, and enterprise AI integrations. Use PROACTIVELY for LLM features, chatbots, AI agents, or AI-powered applications.

85 stars

Best use case

ai-engineer is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Build production-ready LLM applications, advanced RAG systems, and intelligent agents. Implements vector search, multimodal AI, agent orchestration, and enterprise AI integrations. Use PROACTIVELY for LLM features, chatbots, AI agents, or AI-powered applications.

Teams using ai-engineer should expect a more consistent output, faster repeated execution, less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$curl -o ~/.claude/skills/ai-engineer/SKILL.md --create-dirs "https://raw.githubusercontent.com/curiositech/some_claude_skills/main/.claude/skills/ai-engineer/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/ai-engineer/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How ai-engineer Compares

Feature / Agentai-engineerStandard Approach
Platform SupportNot specifiedLimited / Varies
Context Awareness High Baseline
Installation ComplexityUnknownN/A

Frequently Asked Questions

What does this skill do?

Build production-ready LLM applications, advanced RAG systems, and intelligent agents. Implements vector search, multimodal AI, agent orchestration, and enterprise AI integrations. Use PROACTIVELY for LLM features, chatbots, AI agents, or AI-powered applications.

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# AI Engineer

Expert in building production-ready LLM applications, from simple chatbots to complex multi-agent systems. Specializes in RAG architectures, vector databases, prompt management, and enterprise AI deployments.

## Quick Start

```
User: "Build a customer support chatbot with our product documentation"

AI Engineer:
1. Design RAG architecture (chunking, embedding, retrieval)
2. Set up vector database (Pinecone/Weaviate/Chroma)
3. Implement retrieval pipeline with reranking
4. Build conversation management with context
5. Add guardrails and fallback handling
6. Deploy with monitoring and observability
```

**Result**: Production-ready AI chatbot in days, not weeks

## Core Competencies

### 1. RAG System Design
| Component | Implementation | Best Practices |
|-----------|---------------|----------------|
| **Chunking** | Semantic, token-based, hierarchical | 512-1024 tokens, overlap 10-20% |
| **Embedding** | OpenAI, Cohere, local models | Match model to domain |
| **Vector DB** | Pinecone, Weaviate, Chroma, Qdrant | Index by use case |
| **Retrieval** | Dense, sparse, hybrid | Start hybrid, tune |
| **Reranking** | Cross-encoder, Cohere Rerank | Always rerank top-k |

### 2. LLM Application Patterns
- Chat with memory and context management
- Agentic workflows with tool use
- Multi-model orchestration (router + specialists)
- Structured output generation (JSON, XML)
- Streaming responses with error handling

### 3. Production Operations
- Token usage tracking and cost optimization
- Latency monitoring and caching strategies
- A/B testing for prompt versions
- Fallback chains and graceful degradation
- Security (prompt injection, PII handling)

## Architecture Patterns

### Basic RAG Pipeline

```typescript
// Simple RAG implementation
async function ragQuery(query: string): Promise<string> {
  // 1. Embed the query
  const queryEmbedding = await embed(query);

  // 2. Retrieve relevant chunks
  const chunks = await vectorDb.query({
    vector: queryEmbedding,
    topK: 10,
    includeMetadata: true
  });

  // 3. Rerank for relevance
  const reranked = await reranker.rank(query, chunks);
  const topChunks = reranked.slice(0, 5);

  // 4. Generate response with context
  const response = await llm.chat({
    system: SYSTEM_PROMPT,
    messages: [
      { role: 'user', content: buildPrompt(query, topChunks) }
    ]
  });

  return response.content;
}
```

### Agent Architecture

```typescript
// Agentic loop with tool use
interface Agent {
  systemPrompt: string;
  tools: Tool[];
  maxIterations: number;
}

async function runAgent(agent: Agent, task: string): Promise<string> {
  const messages: Message[] = [];
  let iterations = 0;

  while (iterations < agent.maxIterations) {
    const response = await llm.chat({
      system: agent.systemPrompt,
      messages: [...messages, { role: 'user', content: task }],
      tools: agent.tools
    });

    if (!response.toolCalls) {
      return response.content; // Final answer
    }

    // Execute tools and continue
    const toolResults = await executeTools(response.toolCalls);
    messages.push({ role: 'assistant', content: response });
    messages.push({ role: 'tool', content: toolResults });
    iterations++;
  }

  throw new Error('Max iterations exceeded');
}
```

### Multi-Model Router

```typescript
// Route queries to appropriate models
const MODEL_ROUTER = {
  simple: 'claude-3-haiku',     // Fast, cheap
  moderate: 'claude-3-sonnet',   // Balanced
  complex: 'claude-3-opus',      // Best quality
};

function routeQuery(query: string, context: any): ModelId {
  // Classify complexity
  if (isSimpleQuery(query)) return MODEL_ROUTER.simple;
  if (requiresReasoning(query, context)) return MODEL_ROUTER.complex;
  return MODEL_ROUTER.moderate;
}
```

## Implementation Checklist

### RAG System
- [ ] Document ingestion pipeline
- [ ] Chunking strategy (semantic preferred)
- [ ] Embedding model selection
- [ ] Vector database setup
- [ ] Retrieval with hybrid search
- [ ] Reranking layer
- [ ] Citation/source tracking
- [ ] Evaluation metrics (relevance, faithfulness)

### Production Readiness
- [ ] Error handling and retries
- [ ] Rate limiting
- [ ] Token tracking
- [ ] Cost monitoring
- [ ] Latency metrics
- [ ] Caching layer
- [ ] Fallback responses
- [ ] PII filtering
- [ ] Prompt injection guards

### Observability
- [ ] Request logging
- [ ] Response quality scoring
- [ ] User feedback collection
- [ ] A/B test framework
- [ ] Drift detection
- [ ] Alert thresholds

## Anti-Patterns

### Anti-Pattern: RAG Everything
**What it looks like**: Using RAG for every query
**Why wrong**: Adds latency, cost, and complexity when unnecessary
**Instead**: Classify queries, use RAG only when context needed

### Anti-Pattern: Chunking by Character
**What it looks like**: `text.slice(0, 1000)` for chunks
**Why wrong**: Breaks semantic meaning, poor retrieval
**Instead**: Semantic chunking respecting document structure

### Anti-Pattern: No Reranking
**What it looks like**: Using raw vector similarity as final ranking
**Why wrong**: Embedding similarity != relevance for query
**Instead**: Always add cross-encoder reranking

### Anti-Pattern: Unbounded Context
**What it looks like**: Stuffing all retrieved chunks into prompt
**Why wrong**: Dilutes relevance, wastes tokens, confuses model
**Instead**: Top 3-5 chunks after reranking, dynamic selection

### Anti-Pattern: No Guardrails
**What it looks like**: Direct user input to LLM
**Why wrong**: Prompt injection, toxic outputs, off-topic responses
**Instead**: Input validation, output filtering, topic guardrails

## Technology Stack

### Vector Databases
| Database | Best For | Notes |
|----------|----------|-------|
| **Pinecone** | Production, scale | Managed, fast |
| **Weaviate** | Hybrid search | GraphQL, modules |
| **Chroma** | Development, local | Embedded, simple |
| **Qdrant** | Self-hosted, filters | Rust, performant |
| **pgvector** | Existing Postgres | Easy integration |

### LLM Frameworks
| Framework | Best For | Notes |
|-----------|----------|-------|
| **LangChain** | Prototyping | Many integrations |
| **LlamaIndex** | RAG focus | Document handling |
| **Vercel AI SDK** | Streaming, React | Edge-ready |
| **Anthropic SDK** | Direct API | Full control |

### Embedding Models
| Model | Dimensions | Notes |
|-------|------------|-------|
| **text-embedding-3-large** | 3072 | Best quality |
| **text-embedding-3-small** | 1536 | Cost-effective |
| **voyage-2** | 1024 | Code, technical |
| **bge-large** | 1024 | Open source |

## When to Use

**Use for:**
- Building chatbots and conversational AI
- Implementing RAG systems
- Creating AI agents with tools
- Designing multi-model architectures
- Production AI deployments

**Do NOT use for:**
- Prompt optimization (use prompt-engineer)
- ML model training (use ml-engineer)
- Data pipelines (use data-pipeline-engineer)
- General backend (use backend-architect)

---

**Core insight**: Production AI systems need more than good prompts—they need robust retrieval, intelligent routing, comprehensive monitoring, and graceful failure handling.

**Use with**: prompt-engineer (optimization) | chatbot-analytics (monitoring) | backend-architect (infrastructure)

Related Skills

vr-avatar-engineer

85
from curiositech/some_claude_skills

Expert in photorealistic and stylized VR avatar systems for Apple Vision Pro, Meta Quest, and cross-platform metaverse. Specializes in facial tracking (52+ blend shapes), subsurface scattering, Persona-style generation, Photon networking, and real-time LOD. Activate on 'VR avatar', 'Vision Pro Persona', 'Meta avatar', 'facial tracking', 'blend shapes', 'avatar networking', 'photorealistic avatar'. NOT for 2D profile pictures (use image generation), non-VR game characters (use game engine tools), static 3D models (use modeling tools), or motion capture hardware setup.

voice-audio-engineer

85
from curiositech/some_claude_skills

Expert in voice synthesis, TTS, voice cloning, podcast production, speech processing, and voice UI design via ElevenLabs integration. Specializes in vocal clarity, loudness standards (LUFS), de-essing, dialogue mixing, and voice transformation. Activate on 'TTS', 'text-to-speech', 'voice clone', 'voice synthesis', 'ElevenLabs', 'podcast', 'voice recording', 'speech-to-speech', 'voice UI', 'audiobook', 'dialogue'. NOT for spatial audio (use sound-engineer), music production (use DAW tools), game audio middleware (use sound-engineer), sound effects generation (use sound-engineer with ElevenLabs SFX), or live concert audio.

sound-engineer

85
from curiositech/some_claude_skills

Expert in spatial audio, procedural sound design, game audio middleware, and app UX sound design. Specializes in HRTF/Ambisonics, Wwise/FMOD integration, UI sound design, and adaptive music systems. Activate on 'spatial audio', 'HRTF', 'binaural', 'Wwise', 'FMOD', 'procedural sound', 'footstep system', 'adaptive music', 'UI sounds', 'notification audio', 'sonic branding'. NOT for music composition/production (use DAW), audio post-production for film (linear media), voice cloning/TTS (use voice-audio-engineer), podcast editing (use standard audio editors), or hardware design.

site-reliability-engineer

85
from curiositech/some_claude_skills

Docusaurus build health validation and deployment safety for Claude Skills showcase. Pre-commit MDX validation (Liquid syntax, angle brackets, prop mismatches), pre-build link checking, post-build health reports. Activate on 'build errors', 'commit hooks', 'deployment safety', 'site health', 'MDX validation'. NOT for general DevOps (use deployment-engineer), Kubernetes/cloud infrastructure (use kubernetes-architect), runtime monitoring (use observability-engineer), or non-Docusaurus projects.

prompt-engineer

85
from curiositech/some_claude_skills

Expert prompt optimization for LLMs and AI systems. Use PROACTIVELY when building AI features, improving agent performance, or crafting system prompts. Masters prompt patterns and techniques.

data-pipeline-engineer

85
from curiositech/some_claude_skills

Expert data engineer for ETL/ELT pipelines, streaming, data warehousing. Activate on: data pipeline, ETL, ELT, data warehouse, Spark, Kafka, Airflow, dbt, data modeling, star schema, streaming data, batch processing, data quality. NOT for: API design (use api-architect), ML training (use ML skills), dashboards (use design skills).

skill-coach

85
from curiositech/some_claude_skills

Guides creation of high-quality Agent Skills with domain expertise, anti-pattern detection, and progressive disclosure best practices. Use when creating skills, reviewing existing skills, or when users mention improving skill quality, encoding expertise, or avoiding common AI tooling mistakes. Activate on keywords: create skill, review skill, skill quality, skill best practices, skill anti-patterns. NOT for general coding advice or non-skill Claude Code features.

3d-cv-labeling-2026

85
from curiositech/some_claude_skills

Expert in 3D computer vision labeling tools, workflows, and AI-assisted annotation for LiDAR, point clouds, and sensor fusion. Covers SAM4D/Point-SAM, human-in-the-loop architectures, and vertical-specific training strategies. Activate on '3D labeling', 'point cloud annotation', 'LiDAR labeling', 'SAM 3D', 'SAM4D', 'sensor fusion annotation', '3D bounding box', 'semantic segmentation point cloud'. NOT for 2D image labeling (use clip-aware-embeddings), general ML training (use ml-engineer), video annotation without 3D (use computer-vision-pipeline), or VLM prompt engineering (use prompt-engineer).

wisdom-accountability-coach

85
from curiositech/some_claude_skills

Longitudinal memory tracking, philosophy teaching, and personal accountability with compassion. Expert in pattern recognition, Stoicism/Buddhism, and growth guidance. Activate on 'accountability', 'philosophy', 'Stoicism', 'Buddhism', 'personal growth', 'commitment tracking', 'wisdom teaching'. NOT for therapy or mental health treatment (refer to professionals), crisis intervention, or replacing professional coaching credentials.

windows-95-web-designer

85
from curiositech/some_claude_skills

Modern web applications with authentic Windows 95 aesthetic. Gradient title bars, Start menu paradigm, taskbar patterns, 3D beveled chrome. Extrapolates Win95 to AI chatbots, mobile UIs, responsive layouts. Activate on 'windows 95', 'win95', 'start menu', 'taskbar', 'retro desktop', '95 aesthetic', 'clippy'. NOT for Windows 3.1 (use windows-3-1-web-designer), vaporwave/synthwave, macOS, flat design.

windows-3-1-web-designer

85
from curiositech/some_claude_skills

Modern web applications with authentic Windows 3.1 aesthetic. Solid navy title bars, Program Manager navigation, beveled borders, single window controls. Extrapolates Win31 to AI chatbots (Cue Card paradigm), mobile UIs (pocket computing). Activate on 'windows 3.1', 'win31', 'program manager', 'retro desktop', '90s aesthetic', 'beveled'. NOT for Windows 95 (use windows-95-web-designer - has gradients, Start menu), vaporwave/synthwave, macOS, flat design.

win31-pixel-art-designer

85
from curiositech/some_claude_skills

Expert in Windows 3.1 era pixel art and graphics. Creates icons, banners, splash screens, and UI assets with authentic 16/256-color palettes, dithering patterns, and Program Manager styling. Activate on 'win31 icons', 'pixel art 90s', 'retro icons', '16-color', 'dithering', 'program manager icons', 'VGA palette'. NOT for modern flat icons, vaporwave art, or high-res illustrations.