devtu-optimize-descriptions

Optimize tool descriptions in ToolUniverse JSON configs for clarity and usability. Reviews descriptions for missing prerequisites, unexpanded abbreviations, unclear parameters, and missing usage guidance. Use when reviewing tool descriptions, improving API documentation, or when user asks to check if tools are easy to understand.

1,202 stars

by mims-harvard


Best use case

devtu-optimize-descriptions is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using devtu-optimize-descriptions can expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/devtu-optimize-descriptions/SKILL.md --create-dirs "https://raw.githubusercontent.com/mims-harvard/ToolUniverse/main/skills/devtu-optimize-descriptions/SKILL.md"

Manual Installation

1. Download SKILL.md from GitHub
2. Place it in .claude/skills/devtu-optimize-descriptions/SKILL.md inside your project
3. Restart your AI agent; it will auto-discover the skill

How devtu-optimize-descriptions Compares

Feature / Agent	devtu-optimize-descriptions	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

SKILL.md Source

# ToolUniverse Tool Description Optimization

Optimize tool descriptions in ToolUniverse JSON configuration files to ensure they are clear, complete, and user-friendly.

## When to Apply This Skill

Use when:
- Reviewing newly created tool descriptions
- User asks "are these tools easy to understand?"
- Improving existing tool documentation
- Adding new tools to ToolUniverse
- User mentions tool usability, clarity, or documentation

## Quick Optimization Checklist

```markdown
Tool Description Review:
- [ ] Prerequisites stated (packages, API keys, accounts)
- [ ] Critical abbreviations expanded on first use
- [ ] Required vs optional parameters clear
- [ ] Mutually exclusive options numbered/labeled
- [ ] Parameter guidance includes trade-offs
- [ ] Filter syntax shows available fields
- [ ] File size warnings where relevant
- [ ] Examples show realistic usage
```

## Critical Improvements (Fix Immediately)

### 1. Clarify Required Input Requirements

**Problem**: Users don't know if they need ONE input or ALL inputs.

**Fix**: Use "**Required: Provide ONE input type**" for mutually exclusive options.

```json
// Before
"description": "Process BED regions, motifs, or gene lists..."

// After
"description": "Process genomic data. **Required: Provide ONE input type** - (1) BED regions, (2) DNA motif, or (3) gene list. Analyzes..."
```

Number the options and use bold for "Required".
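A small lint pass can flag descriptions that still need this fix. The sketch below is illustrative only and is not part of ToolUniverse: `check_exclusive_inputs` is a hypothetical helper, and it assumes options are numbered as "(1)", "(2)", and so on, as in the example above.

```python
import re

REQUIRED_MARKER = "**Required: Provide ONE input type**"


def check_exclusive_inputs(description: str, option_count: int) -> list[str]:
    """Return problems with how a description presents mutually exclusive inputs."""
    problems = []
    if REQUIRED_MARKER not in description:
        problems.append("missing bold 'Required: Provide ONE input type' marker")
    # Numbered options appear as "(1)", "(2)", ... in the description text.
    found = {int(n) for n in re.findall(r"\((\d+)\)", description)}
    missing = [n for n in range(1, option_count + 1) if n not in found]
    if missing:
        problems.append(f"options not numbered: {missing}")
    return problems


before = "Process BED regions, motifs, or gene lists..."
after = (
    "Process genomic data. **Required: Provide ONE input type** - "
    "(1) BED regions, (2) DNA motif, or (3) gene list. Analyzes..."
)

print(check_exclusive_inputs(before, 3))  # two problems reported
print(check_exclusive_inputs(after, 3))   # []
```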

### 2. Add Prerequisites to First Tool

**Problem**: Users don't know what to install/configure before use.

**Fix**: Add prerequisites note to first tool in each family.

```json
"description": "Query single-cell data. Prerequisites: Requires 'package-name' (install: pip install tooluniverse[extra]). Returns..."
```

Include:
- Package installation command
- API key requirements
- Account creation instructions

### 3. Expand Critical Abbreviations

**Problem**: New users don't understand technical terms.

**Fix**: Expand on first use with format: "Abbreviation (Full Name)".

Common abbreviations to expand:
- H5AD → HDF5-based AnnData
- RPM → Reads Per Million
- TSS → Transcription Start Site
- TAD → Topologically Associating Domain
- DRS → Data Repository Service
- Tool and standard names (MACS2, IUPAC, etc.)

```json
// Before
"description": "Download H5AD files..."

// After  
"description": "Download H5AD (HDF5-based AnnData) files..."
```
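The "Abbreviation (Full Name)" rule lends itself to an automated check. The following sketch assumes a hand-maintained expansion table (seeded here with the abbreviations listed above) and only inspects the first occurrence of each abbreviation. `unexpanded_abbreviations` is a hypothetical helper, not a ToolUniverse function.

```python
import re

# Expansion table taken from the list above; extend it per project.
EXPANSIONS = {
    "H5AD": "HDF5-based AnnData",
    "RPM": "Reads Per Million",
    "TSS": "Transcription Start Site",
    "TAD": "Topologically Associating Domain",
    "DRS": "Data Repository Service",
}


def unexpanded_abbreviations(description: str) -> list[str]:
    """Return abbreviations whose first use is not followed by '(Full Name)'."""
    missing = []
    for abbr, full in EXPANSIONS.items():
        match = re.search(rf"\b{re.escape(abbr)}\b", description)
        if match is None:
            continue  # abbreviation is not used in this description
        rest = description[match.end():]
        if not rest.lstrip().startswith(f"({full})"):
            missing.append(abbr)
    return missing


print(unexpanded_abbreviations("Download H5AD files..."))
print(unexpanded_abbreviations("Download H5AD (HDF5-based AnnData) files..."))
```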

## High-Priority Improvements

### 4. Enhance Filter Parameter Descriptions

**Problem**: Users don't know what fields are available or what syntax to use.

**Fix**: List operators, common fields, and provide multiple examples.

```json
"parameter_name": {
  "type": "string",
  "description": "Filter using SQL-like syntax. Format: 'field == \"value\"'. Operators: ==, !=, in, <, >, <=, >=. Combine with 'and'/'or'. Common fields: tissue, cell_type, disease, assay, sex, ethnicity. Examples: 'tissue == \"lung\"', 'disease == \"COVID-19\" and tissue == \"lung\"', 'cell_type in [\"T cell\", \"B cell\"]'."
}
```

Include:
- Syntax format
- Available operators
- List of 5-10 common fields
- 2-3 diverse examples
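The four items above can be checked with a rough lint. This sketch assumes the filter description uses the literal labels "Format:", "Operators:", "Common fields:", and "Examples:" as in the example, and that examples are single-quoted. `check_filter_description` is a hypothetical helper.

```python
import re


def check_filter_description(description: str) -> list[str]:
    """Return problems found in a filter parameter description."""
    problems = [
        f"missing '{label}' part"
        for label in ("Format:", "Operators:", "Common fields:", "Examples:")
        if label not in description
    ]
    # The field list runs from the label to the next period.
    fields = re.search(r"Common fields:\s*([^.]*)\.", description)
    if fields:
        count = len([f for f in fields.group(1).split(",") if f.strip()])
        if not 5 <= count <= 10:
            problems.append(f"{count} common fields listed, expected 5-10")
    if "Examples:" in description:
        tail = description.split("Examples:", 1)[1]
        examples = re.findall(r"'[^']*'", tail)  # single-quoted examples
        if len(examples) < 2:
            problems.append(f"{len(examples)} example(s) given, expected 2-3")
    return problems


good = (
    "Filter using SQL-like syntax. Format: 'field == \"value\"'. "
    "Operators: ==, !=, in, <, >, <=, >=. Combine with 'and'/'or'. "
    "Common fields: tissue, cell_type, disease, assay, sex, ethnicity. "
    "Examples: 'tissue == \"lung\"', "
    "'disease == \"COVID-19\" and tissue == \"lung\"', "
    "'cell_type in [\"T cell\", \"B cell\"]'."
)

print(check_filter_description(good))
print(check_filter_description("Filter expression"))
```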

### 5. Improve Parameter Guidance

**Problem**: Users don't know which value to choose or what trade-offs exist.

**Fix**: Explain what each value means and provide recommendations.

```json
// Before
"threshold": "Q-value threshold (05=1e-5, 10=1e-10, 20=1e-20)"

// After
"threshold": "Peak calling stringency. '05'=1e-5 (permissive, more peaks, broad features), '10'=1e-10 (moderate, balanced), '20'=1e-20 (strict, high confidence, narrow peaks). Default '05' suitable for most analyses. Higher values = fewer but more confident peaks."
```

For each parameter option, explain:
- What it means practically
- When to use it
- Trade-offs involved
- Recommended default

### 6. Number Mutually Exclusive Options

**Problem**: Users provide multiple options when only one is allowed.

**Fix**: Label options as "**Option 1**", "**Option 2**", etc.

```json
"bed_data": {
  "description": "**Option 1**: BED format regions (tab-separated: chr, start, end). Example: 'chr1\\t1000\\t2000'."
},
"motif": {
  "description": "**Option 2**: DNA sequence motif in IUPAC notation. Use: A/T/G/C, N=any base, W=A|T, S=G|C. Example: 'CANNTG'."
},
"gene_list": {
  "description": "**Option 3**: Gene symbols as array. Example: ['TP53', 'MDM2']."
}
```
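To confirm that the labels are consistent across a tool's parameters, a check along these lines may help. It assumes each mutually exclusive parameter description starts with "**Option N**" as shown above. Both helper names are hypothetical.

```python
import re


def option_labels(properties: dict) -> dict[str, int]:
    """Map each parameter name to the N of its leading '**Option N**' label."""
    labels = {}
    for name, spec in properties.items():
        match = re.match(r"\*\*Option (\d+)\*\*", spec.get("description", ""))
        if match:
            labels[name] = int(match.group(1))
    return labels


def labels_are_sequential(properties: dict) -> bool:
    """True if the option labels are exactly 1..k with no gaps or duplicates."""
    numbers = sorted(option_labels(properties).values())
    return numbers == list(range(1, len(numbers) + 1))


properties = {
    "bed_data": {"description": "**Option 1**: BED format regions (tab-separated: chr, start, end)."},
    "motif": {"description": "**Option 2**: DNA sequence motif in IUPAC notation."},
    "gene_list": {"description": "**Option 3**: Gene symbols as array."},
}

print(option_labels(properties))
print(labels_are_sequential(properties))
```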

## Medium-Priority Improvements

### 7. Add File Size Warnings

For tools that download or return large files:

```json
"description": "Download contact matrices. Note: Files can be large (GBs), check file_size in metadata before downloading. Returns..."
```

### 8. Clarify Web Form vs API Results

When tool returns submission URL instead of direct results:

```json
"description": "Perform enrichment analysis. Note: Returns submission URL (web form-based analysis). Analyzes..."
```

### 9. Explain File Type Differences

For tools with multiple format options:

```json
"file_type": "File format. Common types: 'cooler' (single-resolution contact matrix), 'pairs' (aligned read pairs), 'hic' (Juicer format), 'mcool' (multi-resolution cooler)."
```

## Description Structure Template

```json
{
  "name": "Tool_operation_name",
  "type": "ToolClassName",
  "description": "[Action verb] to [purpose]. [Prerequisites if first tool]. [Key data/features]. [Required inputs if mutually exclusive]. [Note about limitations/requirements]. Use for: [use case 1], [use case 2], [use case 3].",
  "parameter": {
    "properties": {
      "param_name": {
        "type": "string",
        "description": "[What it does]. [Format/syntax if applicable]. [Options with trade-offs]. [Examples]. [Recommendation if applicable]."
      }
    }
  }
}
```
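As an illustration of how the tool-level template slots fit together, the sketch below assembles a description from optional parts in the template's order. `build_description` and its parameter names are hypothetical and exist only for this example.

```python
from typing import Optional


def build_description(
    action: str,
    use_cases: list[str],
    prerequisites: Optional[str] = None,
    features: Optional[str] = None,
    required_inputs: Optional[str] = None,
    note: Optional[str] = None,
) -> str:
    """Join the template slots in order, skipping any that are not provided."""
    parts = [action]
    if prerequisites:
        parts.append(f"Prerequisites: {prerequisites}")
    if features:
        parts.append(features)
    if required_inputs:
        parts.append(required_inputs)
    if note:
        parts.append(f"Note: {note}")
    parts.append("Use for: " + ", ".join(use_cases))
    # Give every part exactly one trailing period.
    return " ".join(part.rstrip(".") + "." for part in parts)


print(
    build_description(
        "Query single-cell data",
        ["discovery", "analysis", "integration"],
        prerequisites="Requires 'package-name'",
    )
)
```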

## Description Quality Checklist

### Clarity Checks
- [ ] Purpose clear in first sentence
- [ ] Technical terms expanded
- [ ] Prerequisites stated upfront
- [ ] Examples show realistic usage
- [ ] "Use for:" section lists 3-5 concrete use cases

### Completeness Checks
- [ ] Required inputs clearly marked
- [ ] Parameter choices explained
- [ ] Limitations noted (file size, web form, etc.)
- [ ] Available fields listed for filters
- [ ] Default values recommended

### Usability Checks
- [ ] New users can understand without external docs
- [ ] Users know what to provide
- [ ] Users can make informed parameter choices
- [ ] Error prevention (mutually exclusive options labeled)

## Testing Description Quality

To verify description quality, ask:

1. **Can a new user understand what the tool does?**
   - Read only the description (no docs)
   - Should be clear within 30 seconds

2. **Can a user provide correct inputs on first try?**
   - Required inputs obvious
   - Format/syntax clear
   - Mutually exclusive options labeled

3. **Can a user choose appropriate parameters?**
   - Trade-offs explained
   - Recommendations provided
   - Defaults justified

4. **Are prerequisites obvious?**
   - Installation instructions
   - API keys/accounts
   - File size warnings

## Common Patterns by Tool Type

### API Query Tools
```json
"description": "Query [data type] from [source]. [Prerequisites]. Filter by [criteria]. Returns [output]. [Data scale]. Use for: [discovery], [analysis], [specific research tasks]."
```

Key elements:
- What you're querying
- How to filter
- What you get back
- Scale of data
- Prerequisites

### Data Download Tools
```json
"description": "Download [file types] from [source]. [Format details]. [File size warning]. [Authentication requirement]. Use for: [offline analysis], [custom processing], [integration]."
```

Key elements:
- File formats available
- Size warning
- Authentication needs
- What's in the files

### Enrichment/Analysis Tools
```json
"description": "Analyze [input type] to find [results]. **Required: Provide ONE input type** - (1) [option], (2) [option], (3) [option]. Compares against [database/background]. [Result format]. Use for: [identifying], [discovering], [predicting]."
```

Key elements:
- Input requirements clear
- Options numbered
- What gets compared
- What you learn

## Validation Commands

After updating descriptions, validate JSON syntax:

```bash
# Validate a single tool JSON file (replace your_tools.json with the real file name)
python3 -m json.tool src/tooluniverse/data/your_tools.json > /dev/null && echo "✓ Valid"

# Check every *_tools.json file in the data directory
for f in src/tooluniverse/data/*_tools.json; do
  python3 -m json.tool "$f" > /dev/null && echo "✓ $f valid" || echo "✗ $f invalid"
done
```
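`json.tool` only confirms that a file parses. A short Python check can also flag tools or parameters whose descriptions are empty. This is a sketch under stated assumptions: it expects the file content to be a JSON list of tool objects (or a single object) shaped like the template in this guide, and `validate_tools_text` is a hypothetical helper.

```python
import json


def validate_tools_text(text: str) -> list[str]:
    """Return syntax or description problems found in a tool config's JSON text."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as err:
        return [f"invalid JSON at line {err.lineno}, column {err.colno}: {err.msg}"]
    # Assumption: a file holds a list of tool objects; accept a single object too.
    tools = data if isinstance(data, list) else [data]
    problems = []
    for tool in tools:
        name = tool.get("name", "<unnamed>")
        if not tool.get("description", "").strip():
            problems.append(f"{name}: empty tool description")
        properties = (tool.get("parameter") or {}).get("properties") or {}
        for param, spec in properties.items():
            if not spec.get("description", "").strip():
                problems.append(f"{name}.{param}: empty parameter description")
    return problems


sample = """
[
  {
    "name": "Tool_enrichment",
    "description": "Perform enrichment with tool to find factors.",
    "parameter": {"properties": {"threshold": {"type": "string"}}}
  }
]
"""

print(validate_tools_text(sample))
```

To run it against a real file, read the file's text with `pathlib.Path(path).read_text()` and pass the result to the function.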

## Example: Before and After

**Before (Unclear):**
```json
{
  "name": "Tool_enrichment",
  "description": "Perform enrichment with tool to find factors.",
  "parameter": {
    "properties": {
      "bed": {"description": "BED data"},
      "motif": {"description": "Motif"},
      "genes": {"description": "Genes"},
      "threshold": {"description": "Threshold value"}
    }
  }
}
```

**After (Clear):**
```json
{
  "name": "Tool_enrichment_analysis",
  "description": "Identify transcription factors enriched in your data. **Required: Provide ONE input type** - (1) BED genomic regions, (2) DNA sequence motif (IUPAC notation), or (3) gene symbol list. Compares against 400,000+ ChIP-seq experiments. Returns ranked proteins with enrichment scores. Note: Returns submission URL (web-based analysis). Use for: identifying regulators of regions, finding proteins bound to motifs, discovering transcription factors regulating genes.",
  "parameter": {
    "properties": {
      "bed_data": {
        "description": "**Option 1**: BED format regions (tab-separated: chr, start, end). For finding proteins bound to genomic regions. Example: 'chr1\\t1000\\t2000'."
      },
      "motif": {
        "description": "**Option 2**: DNA motif in IUPAC notation (A/T/G/C, N=any base, W=A|T, S=G|C, M=A|C, K=G|T, R=A|G, Y=C|T). Example: 'CANNTG' (E-box)."
      },
      "gene_list": {
        "description": "**Option 3**: Gene symbols as array or single gene. Example: ['TP53', 'MDM2', 'CDKN1A']."
      },
      "threshold": {
        "description": "Peak stringency. '05'=1e-5 (permissive, more peaks), '10'=1e-10 (moderate), '20'=1e-20 (strict, high confidence). Default '05' suitable for most analyses."
      }
    }
  }
}
```

## Summary

**Priority order for optimization:**

1. **Critical** (fix immediately):
   - Clarify required inputs
   - Add prerequisites
   - Expand abbreviations

2. **High** (fix soon):
   - Enhance filter descriptions
   - Improve parameter guidance
   - Number mutually exclusive options

3. **Medium** (nice to have):
   - Add file size warnings
   - Clarify web form vs API
   - Explain file type differences

**Expected impact**: 50-75% reduction in user errors, 50-67% faster time to first successful use.

Related Skills

devtu-self-evolve

1202

from mims-harvard/ToolUniverse

Orchestrate the full ToolUniverse self-improvement cycle: discover APIs, create tools, test with researcher personas, fix issues, optimize skills, and push via git. References and dispatches to all other devtu skills. Use when asked to: run the self-improvement loop, do a debug/test round, expand tool coverage, improve tool quality, or evolve ToolUniverse.

devtu-optimize-skills

1202

from mims-harvard/ToolUniverse

Optimize ToolUniverse skills for better report quality, evidence handling, and user experience. Apply patterns like tool verification, foundation data layers, disambiguation-first, evidence grading, quantified completeness, and report-only output. Use when reviewing skills, improving existing skills, or creating new ToolUniverse research skills.

devtu-github

1202

from mims-harvard/ToolUniverse

GitHub workflow for ToolUniverse - push code safely by moving temp files, activating pre-commit hooks, running tests, and cleaning staged files. Use when pushing to GitHub, fixing CI failures, or cleaning up before commits.

devtu-fix-tool

1202

from mims-harvard/ToolUniverse

Fix failing ToolUniverse tools by diagnosing test failures, identifying root causes, implementing fixes, and validating solutions. Use when ToolUniverse tools fail tests, return errors, have schema validation issues, or when asked to debug or fix tools in the ToolUniverse framework.

devtu-docs-quality

1202

from mims-harvard/ToolUniverse

TOP PRIORITY skill — find and immediately fix or remove every piece of wrong, outdated, or redundant information in ToolUniverse docs. Wrong code, broken links, incorrect counts, and overlapping instructions must be fixed or removed — never left in place. Runs five phases: (D) static method scan, (C) live code execution, (A) automated validation, (B) ToolUniverse audit, (E) less-is-more simplification. Core philosophy: each concept appears exactly once; remove don't add; no emojis; single setup entry point. Use when reviewing docs, before releases, after API changes, or when asked to audit, fix, or simplify documentation.

devtu-create-tool

1202

from mims-harvard/ToolUniverse

Create new scientific tools for ToolUniverse framework with proper structure, validation, and testing. Use when users need to add tools to ToolUniverse, implement new API integrations, create tool wrappers for scientific databases/services, expand ToolUniverse capabilities, or follow ToolUniverse contribution guidelines. Supports creating tool classes, JSON configurations, validation, error handling, and test examples.

devtu-code-optimization

1202

from mims-harvard/ToolUniverse

Code quality patterns and guidelines for ToolUniverse tool development. Apply when writing, fixing, or refactoring tool Python code in the ToolUniverse project. Encodes lessons from 80+ debug rounds. Use alongside devtu-fix-tool and devtu-self-evolve. Triggers: implementing tool fixes, writing new tool classes, reviewing tool code quality, checking schema correctness, looking up API-specific bug fixes.

devtu-auto-discover-apis

1202

from mims-harvard/ToolUniverse

Automatically discover life science APIs online, create ToolUniverse tools, validate them, and prepare integration PRs. Performs gap analysis to identify missing tool categories, web searches for APIs, automated tool creation using devtu-create-tool patterns, validation with devtu-fix-tool, and git workflow management. Use when expanding ToolUniverse coverage, adding new API integrations, or systematically discovering scientific resources.

tooluniverse

1202

from mims-harvard/ToolUniverse

Router skill for ToolUniverse tasks. First checks if specialized tooluniverse skills (105+ skills covering disease/drug/target research, gene-disease associations, clinical decision support, genomics, epigenomics, proteomics, comparative genomics, chemical safety, toxicology, systems biology, and more) can solve the problem, then falls back to general strategies for using 2300+ scientific tools. Covers tool discovery, multi-hop queries, comprehensive research workflows, disambiguation, evidence grading, and report generation. Use when users need to research any scientific topic, find biological data, or explore drug/target/disease relationships. ALSO USE for any biology, medicine, chemistry, pharmacology, or life science question — even simple factoid questions like "how many X in protein Y", "what drug interacts with Z", "what gene causes disease W", or "translate this sequence". These questions benefit from database lookups (UniProt, PubMed, ChEMBL, ClinVar, GWAS Catalog, etc.) rather than answering from memory alone. When in doubt about a scientific fact, USE THIS SKILL to verify against real databases.

tooluniverse-variant-to-mechanism

1202

from mims-harvard/ToolUniverse

End-to-end variant-to-mechanism analysis: given a genetic variant (rsID or coordinates), trace its functional impact from regulatory context (GWAS, eQTL, RegulomeDB, ENCODE) through target gene identification (GTEx, OpenTargets L2G) to downstream pathway and disease biology (STRING, Reactome, GO enrichment, disease associations). Produces an evidence-graded mechanistic narrative linking genotype to phenotype. Use when asked "how does this variant cause disease?", "what is the mechanism of rs7903146?", "trace variant to pathway", or "connect this GWAS hit to biology".

tooluniverse-variant-interpretation

1202

from mims-harvard/ToolUniverse

Systematic clinical variant interpretation from raw variant calls to ACMG-classified recommendations with structural impact analysis. Aggregates evidence from ClinVar, gnomAD, CIViC, UniProt, and PDB across ACMG criteria. Produces pathogenicity scores (0-100), clinical recommendations, and treatment implications. Use when interpreting genetic variants, classifying variants of uncertain significance (VUS), performing ACMG variant classification, or translating variant calls to clinical actionability.

tooluniverse-variant-functional-annotation

1202

from mims-harvard/ToolUniverse

Comprehensive functional annotation of protein variants — pathogenicity, population frequency, structural context, and clinical significance. Integrates ProtVar (map_variant, get_function, get_population) for protein-level mapping and structural context, ClinVar for clinical classifications, gnomAD for population frequency with ancestry data, CADD for deleteriousness scores, and ClinGen for gene-disease validity. Produces a structured variant annotation report with evidence grading. Use when asked about protein variant impact, missense variant pathogenicity, ProtVar annotation, variant functional context, or combining population and structural evidence for a variant.