ai-agents

AI agent architecture, tool use, memory systems, multi-agent orchestration, and safety patterns

39 stars

by InugamiDev

Best use case

ai-agents is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using ai-agents can expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/ai-agents/SKILL.md --create-dirs "https://raw.githubusercontent.com/InugamiDev/ultrathink-oss/main/.claude/skills/ai-agents/SKILL.md"

Manual Installation

1. Download SKILL.md from GitHub.
2. Place it in .claude/skills/ai-agents/SKILL.md inside your project.
3. Restart your AI agent; it will auto-discover the skill.

How ai-agents Compares

| Feature / Agent | ai-agents | Standard Approach |
|-----------------|-----------|-------------------|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |

Frequently Asked Questions

What does this skill do?

It provides guidance on AI agent architecture, tool use, memory systems, multi-agent orchestration, and safety patterns.

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# AI Agents Skill

## Purpose

Design and implement AI agent systems that can reason, use tools, maintain memory, and coordinate with other agents. This skill covers the ReAct pattern, tool definition, memory architectures (short-term, long-term, episodic), multi-agent orchestration, and safety guardrails. Agents are not just chatbots -- they are autonomous systems that take actions in the world.

## Key Concepts

### Agent Architecture

```
PERCEPTION:
  Input parsing, context extraction, intent classification

REASONING:
  Chain-of-thought, planning, self-reflection, error correction

ACTION:
  Tool selection, parameter construction, execution

MEMORY:
  Working memory (current conversation)
  Short-term memory (recent interactions, scratch pad)
  Long-term memory (persistent knowledge, embeddings)
  Episodic memory (past task execution records)

LOOP:
  Observe -> Think -> Act -> Observe -> Think -> Act -> ... -> Done
```

### Agent Patterns

```
ReAct (Reason + Act):
  Thought: I need to find the user's order status.
  Action: query_database(order_id="ord_123")
  Observation: Order status is "shipped", tracking: "1Z999AA..."
  Thought: I have the information. I'll respond to the user.
  Answer: Your order has been shipped! Tracking: 1Z999AA...

Plan-and-Execute:
  Plan: [Step 1: Search products, Step 2: Compare prices, Step 3: Recommend]
  Execute each step, revise plan if needed

Reflection:
  After completing a task, evaluate quality and retry if insufficient

Multi-Agent:
  Researcher agent -> Analyst agent -> Writer agent -> Reviewer agent
  Each agent specialized for one part of the workflow
```
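
The Reflection pattern above can be sketched as a bounded retry loop. This is a minimal illustration, not a definitive implementation: `Critique`, `generate`, and `evaluate` are hypothetical names introduced here, and in a real agent both callbacks would typically be LLM calls.

```typescript
// Hypothetical shapes: `Critique`, `generate`, and `evaluate` are illustrative
// and not part of any SDK.
interface Critique {
  acceptable: boolean;
  feedback: string;
}

async function runWithReflection(
  task: string,
  generate: (task: string, feedback?: string) => Promise<string>,
  evaluate: (task: string, draft: string) => Promise<Critique>,
  maxAttempts = 3,
): Promise<{ draft: string; attempts: number; accepted: boolean }> {
  let feedback: string | undefined;
  let draft = '';

  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    draft = await generate(task, feedback);
    const critique = await evaluate(task, draft);
    if (critique.acceptable) {
      return { draft, attempts: attempt, accepted: true };
    }
    // Feed the critique into the next attempt instead of retrying blindly
    feedback = critique.feedback;
  }

  // Out of attempts: return the last draft and flag it as not accepted
  return { draft, attempts: maxAttempts, accepted: false };
}
```

Capping `maxAttempts` keeps the reflection loop from running indefinitely, and returning `accepted: false` lets the caller decide whether to escalate to a human or surface the best-effort draft.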

## Patterns

### Tool Definition

```typescript
// Tools are typed schemas that the LLM can invoke
interface ToolDefinition {
  name: string;
  description: string;
  parameters: {
    type: 'object';
    properties: Record<string, { type: string; description: string; enum?: string[] }>;
    required: string[];
  };
}

const tools: ToolDefinition[] = [
  {
    name: 'search_products',
    description: 'Search the product catalog by query string. Returns matching products with prices.',
    parameters: {
      type: 'object',
      properties: {
        query: { type: 'string', description: 'Search query' },
        category: { type: 'string', description: 'Product category filter', enum: ['electronics', 'clothing', 'books'] },
        maxPrice: { type: 'number', description: 'Maximum price in dollars' },
      },
      required: ['query'],
    },
  },
  {
    name: 'get_order_status',
    description: 'Get the current status of a customer order by order ID.',
    parameters: {
      type: 'object',
      properties: {
        orderId: { type: 'string', description: 'The order ID (format: ord_xxx)' },
      },
      required: ['orderId'],
    },
  },
];
```
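
Because the model constructs tool arguments itself, it can help to check them against the declared schema before executing anything. The sketch below is one simple way to do that, assuming only the primitive property types used in this document (`string`, `number`, `boolean`); `validateToolInput` and `ParameterSchema` are names introduced here for illustration, and a full JSON Schema validator would be a better fit for nested or array parameters.

```typescript
// Same shape as ToolDefinition['parameters'], repeated so the sketch is self-contained
type PropertySchema = { type: string; description: string; enum?: string[] };
interface ParameterSchema {
  type: 'object';
  properties: Record<string, PropertySchema>;
  required: string[];
}

// Returns a list of human-readable problems; an empty list means the input passed
function validateToolInput(schema: ParameterSchema, input: Record<string, unknown>): string[] {
  const errors: string[] = [];

  for (const key of schema.required) {
    if (input[key] === undefined) errors.push(`missing required parameter "${key}"`);
  }

  for (const [key, value] of Object.entries(input)) {
    const prop = schema.properties[key];
    if (!prop) {
      errors.push(`unknown parameter "${key}"`);
      continue;
    }
    // typeof only distinguishes primitive types; arrays and integers need a real validator
    if (typeof value !== prop.type) {
      errors.push(`"${key}" should be ${prop.type}, got ${typeof value}`);
    } else if (prop.enum && !prop.enum.includes(value as string)) {
      errors.push(`"${key}" must be one of: ${prop.enum.join(', ')}`);
    }
  }

  return errors;
}

const searchSchema: ParameterSchema = {
  type: 'object',
  properties: {
    query: { type: 'string', description: 'Search query' },
    category: { type: 'string', description: 'Product category filter', enum: ['electronics', 'clothing', 'books'] },
    maxPrice: { type: 'number', description: 'Maximum price in dollars' },
  },
  required: ['query'],
};

console.log(validateToolInput(searchSchema, { query: 'laptop', maxPrice: 500 })); // no problems
console.log(validateToolInput(searchSchema, { category: 'toys' })); // missing query, category not in enum
```

Returning the problems as strings rather than throwing makes it easy to send them back to the model as a tool result so it can correct the call.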

### Agent Loop with Anthropic SDK

```typescript
import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic();

async function runAgent(userMessage: string, tools: ToolDefinition[]) {
  const messages: Anthropic.MessageParam[] = [
    { role: 'user', content: userMessage },
  ];

  const MAX_ITERATIONS = 10;

  for (let i = 0; i < MAX_ITERATIONS; i++) {
    const response = await client.messages.create({
      model: 'claude-sonnet-4-20250514',
      max_tokens: 4096,
      system: 'You are a helpful assistant. Use tools when needed to answer questions.',
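      // The Messages API expects each tool's JSON schema under `input_schema`
      tools: tools.map((t) => ({
        name: t.name,
        description: t.description,
        input_schema: t.parameters,
      })),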
      messages,
    });

    // Check if the model wants to use a tool
    if (response.stop_reason === 'tool_use') {
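      const toolUseBlocks = response.content.filter(
        // Type guard so TypeScript knows these blocks carry id, name, and input
        (block): block is Anthropic.ToolUseBlock => block.type === 'tool_use'
      );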

      // Add assistant response to conversation
      messages.push({ role: 'assistant', content: response.content });

      // Execute each tool call
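      const toolResults: Anthropic.ToolResultBlockParam[] = [];
      for (const toolUse of toolUseBlocks) {
        // The SDK types `input` as unknown; narrow it before dispatching
        const result = await executeTool(toolUse.name, toolUse.input as Record<string, unknown>);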
        toolResults.push({
          type: 'tool_result' as const,
          tool_use_id: toolUse.id,
          content: JSON.stringify(result),
        });
      }

      messages.push({ role: 'user', content: toolResults });
    } else {
      // Model is done, return the final response
      return response.content;
    }
  }

  throw new Error('Agent exceeded maximum iterations');
}

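// searchProducts and getOrderStatus are application-specific and assumed to be implemented elsewhere
declare function searchProducts(query: string, filters: Record<string, unknown>): Promise<unknown>;
declare function getOrderStatus(orderId: string): Promise<unknown>;

// Dispatch a tool call to its implementation by name
async function executeTool(name: string, input: Record<string, unknown>) {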
  switch (name) {
    case 'search_products':
      return await searchProducts(input.query as string, input);
    case 'get_order_status':
      return await getOrderStatus(input.orderId as string);
    default:
      throw new Error(`Unknown tool: ${name}`);
  }
}
```
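
In the loop above, an exception thrown by `executeTool` (including the `Unknown tool` case) aborts the whole run. One alternative is to report the failure back to the model so it can recover. The sketch below assumes a hypothetical `safeExecute` helper and a `Map` of handlers; the `ToolResult` interface mirrors the fields of a `tool_result` content block, including the optional `is_error` flag, and is defined locally so the example has no SDK dependency.

```typescript
// Mirrors the fields of a `tool_result` content block
interface ToolResult {
  type: 'tool_result';
  tool_use_id: string;
  content: string;
  is_error?: boolean;
}

type ToolHandler = (input: Record<string, unknown>) => Promise<unknown>;

async function safeExecute(
  handlers: Map<string, ToolHandler>,
  call: { id: string; name: string; input: Record<string, unknown> },
): Promise<ToolResult> {
  const handler = handlers.get(call.name);
  if (!handler) {
    // The model asked for a tool that was never registered (a hallucinated call)
    return { type: 'tool_result', tool_use_id: call.id, content: `Unknown tool: ${call.name}`, is_error: true };
  }

  try {
    const result = await handler(call.input);
    // JSON.stringify(undefined) is not a string, so fall back to null
    return { type: 'tool_result', tool_use_id: call.id, content: JSON.stringify(result ?? null) };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { type: 'tool_result', tool_use_id: call.id, content: message, is_error: true };
  }
}
```

The returned objects can be pushed into the `toolResults` array in place of the hand-built ones, and the iteration cap still bounds how many times the model can retry a failing tool.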

### Memory System

```typescript
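// Minimal supporting types (assumed shapes; adapt them to your message format and store)
type Message = { role: 'user' | 'assistant'; content: string };

interface MemoryEntry {
  content: string;
  metadata?: Record<string, unknown>;
}

interface Memory {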
  // Working memory: current conversation context
  conversation: Message[];

  // Short-term memory: recent scratchpad
  scratchpad: Map<string, unknown>;

  // Long-term memory: persistent knowledge (vector store)
  retrieve(query: string, k?: number): Promise<MemoryEntry[]>;
  store(content: string, metadata?: Record<string, unknown>): Promise<void>;

  // Episodic memory: past task records
  getEpisode(taskId: string): Promise<Episode | null>;
  recordEpisode(episode: Episode): Promise<void>;
}

interface Episode {
  taskId: string;
  objective: string;
  steps: Array<{ thought: string; action: string; result: string }>;
  outcome: 'success' | 'failure';
  reflection: string;
  timestamp: Date;
}

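// Before starting a new task, check if we have done something similar.
// Assumption: episodes are saved through store() with an `outcome` field in
// their metadata, which is how they are told apart from plain knowledge entries.
async function planWithMemory(objective: string, memory: Memory) {
  const entries = await memory.retrieve(objective, 8);
  const similarEpisodes = entries.filter((e) => e.metadata?.outcome !== undefined).slice(0, 3);
  const relevantKnowledge = entries.filter((e) => e.metadata?.outcome === undefined).slice(0, 5);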

  const context = `
    Similar past tasks:
    ${similarEpisodes.map(e => `- ${e.content} (${e.metadata?.outcome})`).join('\n')}

    Relevant knowledge:
    ${relevantKnowledge.map(k => `- ${k.content}`).join('\n')}
  `;

  return context;
}
```
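
To experiment with the `retrieve`/`store` half of this interface without a vector database, a small in-memory stand-in can be enough. The sketch below is an assumption-laden substitute, not how a production long-term memory would work: it ranks entries by word overlap (Jaccard similarity) instead of embedding distance, and `InMemoryStore`, `tokenize`, and `similarity` are names introduced here.

```typescript
interface MemoryEntry {
  content: string;
  metadata?: Record<string, unknown>;
}

// Lowercased word set; a real system would compare embedding vectors instead
function tokenize(text: string): Set<string> {
  return new Set(text.toLowerCase().match(/[a-z0-9]+/g) ?? []);
}

// Jaccard similarity between two token sets: |A ∩ B| / |A ∪ B|
function similarity(a: Set<string>, b: Set<string>): number {
  let shared = 0;
  a.forEach((token) => {
    if (b.has(token)) shared++;
  });
  const union = a.size + b.size - shared;
  return union === 0 ? 0 : shared / union;
}

class InMemoryStore {
  private entries: MemoryEntry[] = [];

  async store(content: string, metadata?: Record<string, unknown>): Promise<void> {
    this.entries.push({ content, metadata });
  }

  // Return up to k entries that share at least one word with the query, best match first
  async retrieve(query: string, k = 5): Promise<MemoryEntry[]> {
    const queryTokens = tokenize(query);
    return this.entries
      .map((entry) => ({ entry, score: similarity(queryTokens, tokenize(entry.content)) }))
      .filter((scored) => scored.score > 0)
      .sort((x, y) => y.score - x.score)
      .slice(0, k)
      .map((scored) => scored.entry);
  }
}
```

Because the method signatures match `retrieve` and `store` in the `Memory` interface, the same calling code should work once this class is swapped for an embedding-backed store.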

### Multi-Agent Orchestration

```typescript
interface Agent {
  name: string;
  systemPrompt: string;
  tools: ToolDefinition[];
  run(input: string): Promise<string>;
}

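// Sequential pipeline
// (createAgent, the *Prompt strings, and the tool definitions are assumed to be defined elsewhere)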
async function researchPipeline(question: string) {
  const researcher = createAgent('researcher', researcherPrompt, [webSearch, readDoc]);
  const analyst = createAgent('analyst', analystPrompt, [calculateMetrics]);
  const writer = createAgent('writer', writerPrompt, [formatReport]);

  const rawFindings = await researcher.run(question);
  const analysis = await analyst.run(`Analyze these findings:\n${rawFindings}`);
  const report = await writer.run(`Write a report from:\n${analysis}`);

  return report;
}

// Supervisor pattern
async function supervisorLoop(objective: string, agents: Agent[]) {
  const supervisor = createAgent('supervisor', supervisorPrompt, [
    { name: 'delegate', description: 'Assign a task to an agent', parameters: { ... } },
    { name: 'complete', description: 'Mark the objective as done', parameters: { ... } },
  ]);

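  // Assumption: the supervisor replies with a JSON-encoded decision,
  // since Agent.run() returns a plain string
  type SupervisorAction =
    | { tool: 'complete'; output: string }
    | { tool: 'delegate'; delegateTo: string; task: string };

  let result = '';
  for (let i = 0; i < 10; i++) {
    const reply = await supervisor.run(`Objective: ${objective}\nProgress: ${result}`);
    const action = JSON.parse(reply) as SupervisorAction;
    if (action.tool === 'complete') return action.output;
    const agent = agents.find((a) => a.name === action.delegateTo);
    if (!agent) throw new Error(`Unknown agent: ${action.delegateTo}`);
    result += `\n[${agent.name}] ${await agent.run(action.task)}`;
  }

  throw new Error('Supervisor exceeded maximum iterations');
}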
```

## Safety Guardrails

```
INPUT VALIDATION:
  - Sanitize user input before passing to LLM
  - Validate tool parameters before execution
  - Reject prompt injection attempts

OUTPUT VALIDATION:
  - Verify tool outputs are within expected bounds
  - Filter sensitive information from responses
  - Check for hallucinated tool calls

EXECUTION BOUNDARIES:
  - Maximum iterations per agent run
  - Token budget per task
  - Allowlist of permitted tools
  - Human-in-the-loop for destructive actions
  - Rate limiting on expensive operations

MONITORING:
  - Log all tool invocations with inputs and outputs
  - Track token usage and cost per agent run
  - Alert on unusual patterns (loops, excessive tool calls)
```
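
Several of the execution boundaries above (tool allowlist, human-in-the-loop for destructive actions, a cap on calls) can be enforced at a single checkpoint in front of tool execution. The following is a rough sketch under stated assumptions: `ToolGate`, `ToolPolicy`, and the `delete_order` tool are hypothetical, and the approval callback is synchronous only to keep the example short; a real reviewer prompt would be asynchronous.

```typescript
type Decision = { allowed: true } | { allowed: false; reason: string };

interface ToolPolicy {
  allowlist: Set<string>;        // tools the agent may call at all
  requiresApproval: Set<string>; // destructive tools that need human sign-off
  maxCallsPerRun: number;        // hard cap on executed tool calls
}

type Approver = (tool: string, input: unknown) => boolean;

class ToolGate {
  private calls = 0;
  private readonly policy: ToolPolicy;
  private readonly approve: Approver;

  constructor(policy: ToolPolicy, approve: Approver) {
    this.policy = policy;
    this.approve = approve;
  }

  // Call before every tool execution; only allowed calls count toward the cap
  check(tool: string, input: unknown): Decision {
    if (!this.policy.allowlist.has(tool)) {
      return { allowed: false, reason: `"${tool}" is not on the allowlist` };
    }
    if (this.calls >= this.policy.maxCallsPerRun) {
      return { allowed: false, reason: 'call limit reached for this run' };
    }
    if (this.policy.requiresApproval.has(tool) && !this.approve(tool, input)) {
      return { allowed: false, reason: `reviewer rejected "${tool}"` };
    }
    this.calls++;
    return { allowed: true };
  }
}

const policy: ToolPolicy = {
  allowlist: new Set(['search_products', 'get_order_status', 'delete_order']),
  requiresApproval: new Set(['delete_order']),
  maxCallsPerRun: 10,
};

// Reviewer stub that rejects every request it is asked about
const gate = new ToolGate(policy, () => false);

console.log(gate.check('search_products', { query: 'laptop' })); // allowed
console.log(gate.check('delete_order', { orderId: 'ord_123' })); // blocked: reviewer rejected
console.log(gate.check('send_email', {}));                       // blocked: not on the allowlist
```

Keeping the policy in data rather than scattered `if` statements makes it easier to log every decision and to review which tools are considered high-risk.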

## Best Practices

1. **Start with a single agent** -- add more agents only when a single agent hits capability limits
2. **Define tools precisely** -- vague descriptions cause the LLM to misuse tools
3. **Set iteration limits** -- agents can loop; always cap at a maximum
4. **Log everything** -- tool calls, reasoning steps, and outcomes for debugging
5. **Human-in-the-loop for irreversible actions** -- deletion, payments, sending emails
6. **Validate tool outputs** -- do not blindly trust external API responses
7. **Use structured output** -- JSON mode or tool use, not free-form text parsing
8. **Budget tokens** -- set max_tokens and track cumulative usage
9. **Test with adversarial inputs** -- prompt injection, impossible tasks, ambiguous requests
10. **Prefer retrieval over memorization** -- use RAG instead of stuffing context into the system prompt
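
As an illustration of practice 8, cumulative usage can be tracked with a small helper that is updated after every model call. This is a sketch rather than a prescribed implementation: `TokenBudget` is a name introduced here, and the `Usage` fields are assumed to match the `input_tokens` and `output_tokens` counts reported in the `usage` object of a Messages API response.

```typescript
// Assumed to match the token counts reported on each model response
interface Usage {
  input_tokens: number;
  output_tokens: number;
}

class TokenBudget {
  private used = 0;
  private readonly limit: number;

  constructor(limit: number) {
    this.limit = limit;
  }

  // Record one response's usage; throws once the cumulative total exceeds the limit
  record(usage: Usage): void {
    this.used += usage.input_tokens + usage.output_tokens;
    if (this.used > this.limit) {
      throw new Error(`Token budget exceeded: ${this.used}/${this.limit}`);
    }
  }

  get remaining(): number {
    return Math.max(0, this.limit - this.used);
  }
}

const budget = new TokenBudget(1000);
budget.record({ input_tokens: 300, output_tokens: 200 });
console.log(budget.remaining); // 500

try {
  budget.record({ input_tokens: 600, output_tokens: 100 });
} catch (err) {
  console.log((err as Error).message); // Token budget exceeded: 1200/1000
}
```

Inside an agent loop, calling `budget.record(response.usage)` after each request would stop a runaway task at the first call that crosses the limit, complementing the iteration cap.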

## Common Pitfalls

| Pitfall | Impact | Fix |
|---------|--------|-----|
| No iteration limit | Infinite loop, cost explosion | Cap at 10-20 iterations |
| Vague tool descriptions | LLM calls wrong tool or wrong params | Write precise, example-rich descriptions |
| No human-in-the-loop | Agent takes irreversible destructive actions | Require confirmation for high-risk tools |
| Stuffing all context in prompt | Token limit exceeded, degraded quality | Use RAG for dynamic context |
| Not logging tool calls | Cannot debug agent failures | Log every tool invocation |
| Trusting LLM output as code | Injection, errors, hallucinations | Validate and sandbox all LLM-generated code |

Related Skills

ultrathink

from InugamiDev/ultrathink-oss

UltraThink Workflow OS — 4-layer skill mesh with persistent memory and privacy hooks for complex engineering tasks. Routes prompts through intent detection to activate the right domain skills automatically.

ultrathink_review

from InugamiDev/ultrathink-oss

Multi-pass code review powered by UltraThink's quality gate — checks correctness, security (OWASP), performance, readability, and project conventions in a single structured pass.

ultrathink_memory

from InugamiDev/ultrathink-oss

Persistent memory system for UltraThink — search, save, and recall project context, decisions, and patterns across sessions using Postgres-backed fuzzy search with synonym expansion.

ui-design

from InugamiDev/ultrathink-oss

Comprehensive UI design system: 230+ font pairings, 48 themes, 65 design systems, 23 design languages, 30 UX laws, 14 color systems, Swiss grid, Gestalt principles, Pencil.dev workflow. Inherits ui-ux-pro-max (99 UX rules) + impeccable-frontend-design (anti-AI-slop). Triggers on any design, UI, layout, typography, color, theme, or styling task.

Zod

from InugamiDev/ultrathink-oss

TypeScript-first schema validation with static type inference.

webinar-registration-page

from InugamiDev/ultrathink-oss

Build a webinar or live event registration page as a self-contained HTML file with countdown timer, speaker bio, agenda, and registration form. Triggers on: "build a webinar registration page", "create a webinar sign-up page", "event registration landing page", "live training registration page", "workshop sign-up page", "create a webinar page", "build an event page", "free webinar landing page", "live demo registration page", "online event page", "create a registration page for my webinar", "build a training event page".

webhooks

from InugamiDev/ultrathink-oss

Webhook design patterns — delivery, retry with exponential backoff, HMAC signature verification, payload validation, idempotency keys

web-workers

from InugamiDev/ultrathink-oss

Offload heavy computation from the main thread using Web Workers, SharedWorkers, and Comlink — structured messaging, transferable objects, and off-main-thread architecture patterns

web-vitals

from InugamiDev/ultrathink-oss

Core Web Vitals monitoring (LCP, FID, CLS, INP, TTFB), measurement with web-vitals library, reporting to analytics, and optimization strategies for Next.js

web-components

from InugamiDev/ultrathink-oss

Native Web Components, custom elements API, Shadow DOM, HTML templates, slots, lifecycle callbacks, and framework-agnostic design patterns

wasm

from InugamiDev/ultrathink-oss

WebAssembly integration — Rust to WASM with wasm-pack/wasm-bindgen, WASI, browser usage, server-side WASM, and performance considerations

vue

from InugamiDev/ultrathink-oss

Vue 3 Composition API, Nuxt patterns, reactivity system, component architecture, and production development practices