ai-tool-designer

Guide for designing effective tools for AI agents. Use when creating tools for custom agent systems or any AI tool interfaces. Provides principles for tool naming, input/output design, error handling, and evaluation methodologies that maximize agent effectiveness.

16 stars

Best use case

ai-tool-designer is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Guide for designing effective tools for AI agents. Use when creating tools for custom agent systems or any AI tool interfaces. Provides principles for tool naming, input/output design, error handling, and evaluation methodologies that maximize agent effectiveness.

Teams using ai-tool-designer should expect a more consistent output, faster repeated execution, less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$curl -o ~/.claude/skills/ai-tool-designer/SKILL.md --create-dirs "https://raw.githubusercontent.com/diegosouzapw/awesome-omni-skill/main/skills/tools/ai-tool-designer/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/ai-tool-designer/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How ai-tool-designer Compares

Feature / Agentai-tool-designerStandard Approach
Platform SupportNot specifiedLimited / Varies
Context Awareness High Baseline
Installation ComplexityUnknownN/A

Frequently Asked Questions

What does this skill do?

Guide for designing effective tools for AI agents. Use when creating tools for custom agent systems or any AI tool interfaces. Provides principles for tool naming, input/output design, error handling, and evaluation methodologies that maximize agent effectiveness.

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# AI Agent Tool Designer

## Overview

This skill provides comprehensive guidance for designing tools that AI agents can use effectively. Whether building custom agent tools or any AI-accessible interfaces, these principles maximize agent success in accomplishing real-world tasks.

Note: Use the more specific mcp-builder skill if you want to create an MCP server.

The quality of a tool system is measured not by how comprehensively it implements features, but by how well it enables AI agents to accomplish realistic, complex tasks using only the tools provided.

---

## Agent-Centric Design Principles

Before implementing any tool system, understand these foundational principles for designing tools that AI agents can use effectively:

### 1. Build for Workflows, Not Just API Endpoints

**Principle:** Design thoughtful, high-impact workflow tools rather than simply wrapping existing API endpoints.

**Why it matters:** Agents need to accomplish complete tasks, not just make individual API calls. Tools that consolidate related operations reduce the number of steps agents must take and improve success rates.

**How to apply:**
- Consolidate related operations (e.g., `schedule_event` that both checks availability and creates the event)
- Focus on tools that enable complete tasks, not just individual API calls
- Consider what workflows agents actually need to accomplish, not just what the underlying API offers
- Ask: "What is the user trying to accomplish?" rather than "What does the API provide?"

**Examples:**
- ❌ Bad: Separate tools `check_calendar_availability`, `create_calendar_event`, `send_event_notification`
- ✅ Good: Single tool `schedule_event` with parameters for checking conflicts and sending notifications

### 2. Optimize for Limited Context

**Principle:** Agents have constrained context windows - make every token count.

**Why it matters:** When agents run out of context, they fail to complete tasks. Verbose tool outputs force agents to make difficult decisions about what information to keep or discard.

**How to apply:**
- Return high-signal information, not exhaustive data dumps
- Provide "concise" vs "detailed" response format options (default to concise)
- Default to human-readable identifiers over technical codes (names over IDs when possible)
- Consider the agent's context budget as a scarce resource
- Implement character limits and graceful truncation (typically 25,000 characters)
- Use pagination with reasonable defaults (20-50 items)

**Examples:**
- ❌ Bad: Return all 50 fields from user object including metadata, internal IDs, timestamps in multiple formats
- ✅ Good: Return name, email, role, and key status fields; offer `detailed=true` parameter for full data

### 3. Design Actionable Error Messages

**Principle:** Error messages should guide agents toward correct usage patterns, not just report failures.

**Why it matters:** Agents learn tool usage through feedback. Clear, educational errors help agents self-correct and succeed on retry.

**How to apply:**
- Suggest specific next steps in error messages
- Make errors educational, not just diagnostic
- Include examples of correct usage when parameters are invalid
- Guide agents toward solutions: "Try using filter='active_only' to reduce results"
- Avoid technical jargon; use natural language

**Examples:**
- ❌ Bad: "Error 400: Invalid request"
- ✅ Good: "The limit parameter must be between 1-100. You provided 500. Try using limit=50 and pagination with offset to retrieve more results."

### 4. Follow Natural Task Subdivisions

**Principle:** Tool names and organization should reflect how humans think about tasks, not just API structure.

**Why it matters:** Agents use tool names and descriptions to decide which tool to call. Natural naming improves tool discovery and reduces wrong tool selections.

**How to apply:**
- Tool names should reflect human mental models of tasks
- Group related tools with consistent prefixes for discoverability
- Design tools around natural workflows, not just API structure
- Use action-oriented naming: `search_users`, `create_project`, `send_message`
- Include service/system prefix to avoid conflicts: `slack_send_message` not just `send_message`

**Examples:**
- ❌ Bad: `api_endpoint_users_post`, `api_endpoint_users_get`, `api_endpoint_users_delete`
- ✅ Good: `create_user`, `search_users`, `delete_user`

### 5. Use Evaluation-Driven Development

**Principle:** Create realistic evaluation scenarios early and let agent feedback drive tool improvements.

**Why it matters:** Only by testing tools with actual agents can you discover usability issues. Prototype quickly and iterate based on real agent performance.

**How to apply:**
- Create 10+ complex, realistic questions agents should answer using your tools
- Test with actual AI agents attempting to solve these questions
- Observe where agents struggle, make mistakes, or run out of context
- Iterate on tool design based on agent feedback
- Measure success by agent task completion rate, not feature completeness

**Process:**
1. Build initial tools based on these principles
2. Create evaluation questions (see [Evaluation Guide](./references/evaluation_guide.md))
3. Test with agents
4. Identify failure patterns
5. Refine tools
6. Repeat

---

## Tool Design Framework

Follow this systematic framework when designing any tool for AI agents:

### Phase 1: Planning

**1. Identify Core Workflows**
- List the most valuable operations agents need to perform
- Prioritize tools that enable the most common and important use cases
- Consider which tools work together to enable complex workflows

**2. Design Input Schemas**
- Use strong validation (dry-validation for Ruby, JSON Schema)
- Include proper constraints (min/max length, regex patterns, ranges)
- Provide clear, descriptive field descriptions with examples
- Set sensible defaults to reduce required parameters

**3. Design Output Formats**
- Support multiple formats (JSON for programmatic, Markdown for human-readable)
- Define consistent response structures across similar tools
- Plan for large-scale usage (thousands of users/resources)
- Implement character limits and truncation strategies
- Include pagination metadata (`has_more`, `next_offset`, `total_count`)

**4. Plan Error Handling**
- Design clear, actionable, agent-friendly error messages
- Handle authentication and authorization errors gracefully
- Consider rate limiting and timeout scenarios
- Provide guidance on how to proceed after errors

### Phase 2: Implementation

**Tool Naming Conventions:**
- Use snake_case: `search_users`, `create_project`
- Include service prefix: `github_create_issue`, `slack_send_message`
- Be action-oriented: start with verbs (get, list, search, create, update, delete)
- Be specific: avoid generic names that could conflict

**Tool Descriptions:**
Write comprehensive descriptions that include:
- One-line summary of what the tool does
- Detailed explanation of purpose and functionality
- When to use this tool (and when NOT to use it)
- Parameter descriptions with examples
- Return value schema
- Error handling guidance

**Tool Annotations** (if supported by your system):
- `readOnlyHint: true` for read-only operations
- `destructiveHint: false` for non-destructive operations
- `idempotentHint: true` if repeated calls have same effect
- `openWorldHint: true` if interacting with external systems

### Phase 3: Refinement

**Code Quality Checklist:**
- ✅ No duplicated code between tools (DRY principle)
- ✅ Shared logic extracted into reusable functions
- ✅ Similar operations return similar formats (consistency)
- ✅ All external calls have error handling
- ✅ Full type coverage (type hints, TypeScript types)
- ✅ Every tool has comprehensive documentation

**Testing:**
- Test with valid and invalid inputs
- Test error handling paths
- Test with real AI agents using evaluation questions
- Test pagination and large result sets
- Test character limits and truncation

---

## Response Format Guidelines

All tools that return data should support multiple formats for flexibility:

### JSON Format (`response_format="json"`)
**Purpose:** Machine-readable structured data for programmatic processing

**Best practices:**
- Include all available fields and metadata
- Use consistent field names and types
- Suitable for when agents need to process data further
- Return IDs alongside names for precision

**Example:**
```json
{
  "users": [
    {
      "id": "U123456",
      "name": "John Doe",
      "email": "john@example.com",
      "role": "developer",
      "active": true
    }
  ],
  "total": 150,
  "count": 20,
  "has_more": true,
  "next_offset": 20
}
```

### Markdown Format (`response_format="markdown"`, typically default)
**Purpose:** Human-readable formatted text for user presentation

**Best practices:**
- Use headers, lists, and formatting for clarity
- Convert timestamps to readable format ("2024-01-15 10:30 UTC" vs epoch)
- Show display names with IDs in parentheses ("@john.doe (U123456)")
- Omit verbose metadata (show one profile image URL, not all sizes)
- Group related information logically
- Use when presenting information to end users

**Example:**
```markdown
## Users (20 of 150)

- **John Doe** (@john.doe)
  - Email: john@example.com
  - Role: Developer
  - Status: Active

- **Jane Smith** (@jane.smith)
  - Email: jane@example.com
  - Role: Designer
  - Status: Active

*Showing 20 results. Use offset=20 to see more.*
```

---

## Pagination Best Practices

For tools that list resources:

**Implementation requirements:**
- Always respect the `limit` parameter (never load all results when limit specified)
- Implement offset-based or cursor-based pagination
- Return pagination metadata: `has_more`, `next_offset`/`next_cursor`, `total_count`
- Never load all results into memory for large datasets
- Default to reasonable limits (20-50 items typical)

**Response structure:**
```json
{
  "items": [...],
  "total": 150,
  "count": 20,
  "offset": 0,
  "has_more": true,
  "next_offset": 20
}
```

**Clear guidance in responses:**
Include instructions for getting more data:
- "Showing 20 of 150 results. Use offset=20 to see the next page."
- "Results truncated. Add filters to narrow the search."

---

## Character Limits and Truncation

To prevent overwhelming context windows:

**Implementation:**
- Define CHARACTER_LIMIT constant (typically 25,000 characters)
- Check response size before returning
- Truncate gracefully with clear indicators
- Provide guidance on how to filter/paginate for complete results

**Example handling:**
```ruby
CHARACTER_LIMIT = 25_000

if result.length > CHARACTER_LIMIT
  truncated_data = data[0...[1, data.length / 2].max]
  response[:truncated] = true
  response[:truncation_message] =
    "Response truncated from #{data.length} to #{truncated_data.length} items. " \
    "Use 'offset' parameter or add filters like status='active' to see more."
end
```

---

## Input Validation Best Practices

**Security and usability:**
- Validate all parameters against schema before processing
- Sanitize file paths to prevent directory traversal
- Validate URLs and external identifiers
- Check parameter sizes and ranges
- Prevent command injection in system calls
- Return clear validation errors with examples of correct format

**Schema design:**
- Use strong validation (dry-validation, JSON Schema)
- Include constraints (minLength, maxLength, pattern, minimum, maximum)
- Provide detailed field descriptions with examples
- Mark required vs optional parameters clearly
- Set sensible defaults where possible

---

## Resources

This skill includes reference documentation for deeper exploration:

### references/tool_design_patterns.md
Comprehensive patterns and anti-patterns for common tool design scenarios with detailed examples.

### references/evaluation_guide.md
Complete methodology for creating evaluation questions that test tool effectiveness with AI agents, including how to run evaluations and interpret results.

---

## Further Reading

For detailed examples and advanced patterns:
- [Tool Design Patterns](./references/tool_design_patterns.md) - Comprehensive patterns and examples
- [Evaluation Guide](./references/evaluation_guide.md) - Testing methodology and evaluation creation

Related Skills

browserbase-tool-automation

16
from diegosouzapw/awesome-omni-skill

Automate Browserbase Tool tasks via Rube MCP (Composio). Always search tools first for current schemas.

browser-tool-automation

16
from diegosouzapw/awesome-omni-skill

Automate Browser Tool tasks via Rube MCP (Composio). Always search tools first for current schemas.

browser-dev-tools

16
from diegosouzapw/awesome-omni-skill

使用 Chrome DevTools MCP 进行前端页面调试、布局优化、性能诊断及交互验证。

apktool

16
from diegosouzapw/awesome-omni-skill

Android APK unpacking and resource extraction tool for reverse engineering. Use when you need to decode APK files, extract resources, examine AndroidManifest.xml, analyze smali code, or repackage modified APKs.

anthropic-dev-tools-mcp-builder

16
from diegosouzapw/awesome-omni-skill

Guide for creating high-quality MCP (Model Context Protocol) servers that enable LLMs to interact with external services through well-designed tools. Use when building MCP servers to integrate external APIs or services, whether in Python (FastMCP) or Node/TypeScript (MCP SDK).

alfworld-tool-locator

16
from diegosouzapw/awesome-omni-skill

This skill searches for a specified tool or device (e.g., a desklamp) within the environment by checking relevant surfaces. It should be triggered when the agent needs a tool to interact with another object as part of the task. The skill takes a tool name as implicit input and outputs navigation actions to likely storage spots (e.g., sidetables, shelves) until the tool is found.

ai-tool-assessment

16
from diegosouzapw/awesome-omni-skill

Assess and report on AI tool subscriptions, usage patterns, and cost-effectiveness. Use for reviewing AI subscriptions, analyzing tool usage, optimizing AI spend.

ai-runtime-toolkit

16
from diegosouzapw/awesome-omni-skill

AI Runtime工具装备系统,支持8个内部专业工具和10+个外部CLI工具的整合管理,提供工具发现、执行和配置功能,遵循整合优于创造的设计理念

ai-image-tools

16
from diegosouzapw/awesome-omni-skill

Generate and edit images using either OpenAI GPT Image 1.5 or Google's Nano Banana Pro (Gemini 3 Pro Image). Use when the user asks to generate/create/edit/modify images. Supports image-to-image editing for both providers and optional mask-based inpainting for OpenAI.

agent-tooling-engineer

16
from diegosouzapw/awesome-omni-skill

Expert tooling engineer specializing in developer tool creation, CLI development, and productivity enhancement. Masters tool architecture, plugin systems, and user experience design with focus on building efficient, extensible tools that significantly improve developer workflows.

agent-ops-tools

16
from diegosouzapw/awesome-omni-skill

Detect available development tools at session start. Saves to .agent/tools.json and warns about missing required tools. Works with or without aoc CLI installed.

add-awesome-tool

16
from diegosouzapw/awesome-omni-skill

This skill should be used when analyzing a link to an AI tool and adding it to the awesome-ai-tools readme with proper categorization