ai-tools

Command-line tools that delegate analysis tasks to AI models. Includes image description, screenshot comparison, smart cropping around people, token counting, essay generation from text, boolean condition evaluation, context gathering, and Android UI interaction via popper. Use for describing images, comparing UI states, cropping photos around faces, counting tokens, generating reports, evaluating conditions, gathering context for analysis, automating Android apps, testing Wear OS, or any task requiring AI inference. Triggers: ai analysis, describe image, compare screenshots, smart crop, crop around people, face crop, count tokens, token count, generate essay, evaluate condition, alt text, image description, UI comparison, visual diff, satisfies condition, boolean evaluation, gemini, context, gather context, research topic, android ui, adb, uiautomator2, popper, automate app, test wear os.

by ithinkihaveacat

Best use case

ai-tools is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using ai-tools should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

curl -o ~/.claude/skills/ai-tools/SKILL.md --create-dirs "https://raw.githubusercontent.com/ithinkihaveacat/dotfiles/main/skills/ai-tools/SKILL.md"

Manual Installation

1. Download SKILL.md from GitHub.
2. Place it in .claude/skills/ai-tools/SKILL.md inside your project.
3. Restart your AI agent; it will auto-discover the skill.

SKILL.md Source

# AI Tools

## Important: Use Scripts First

**ALWAYS prefer the scripts in `scripts/` over raw `curl` API calls.** Scripts
are located in the `scripts/` subdirectory of this skill's folder. They provide
features that raw commands do not:

- Proper image encoding (WebP conversion, alpha removal)
- Appropriate model selection for each task
- Structured output handling (boolean responses via exit codes)
- Meaningful exit codes for shell integration

**When to read the script source:** If a script doesn't do exactly what you
need, or fails due to missing dependencies, read the script source. The scripts
encode Gemini API best practices (image ordering, structured output schemas,
model selection) that may not be obvious—use them as reference when building
similar functionality.

## Quick Start

**Environment:** Set `GEMINI_API_KEY` before running any commands.

**Dependencies:** `curl`, `jq`, `uv` (all tools); `base64`, `magick` (image
tools only)

```bash
# Gather context and analyze
scripts/context gemini-api | scripts/emerson "Explain the key features"

# Describe an image (generate alt-text)
scripts/screenshot-describe screenshot.png

# Compare two images for visual differences
scripts/screenshot-compare before.png after.png

# Smart crop image around detected people
scripts/photo-smart-crop photo.jpg cropped.jpg

# Check if a photo prominently features people
scripts/photo-has-people photo.jpg

# Generate essay-length analysis from text
scripts/emerson "Summarize the key changes" < documentation.md

# Evaluate a boolean condition against text
echo "Hello world" | scripts/satisfies "is a greeting"

# Count tokens in text
cat document.md | scripts/token-count

# Interact with an Android UI via AI
scripts/popper "start an exercise"
```
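
The environment and dependency requirements above can be verified before running any of these commands. The following is a minimal sketch, not part of the skill: `preflight` is a hypothetical helper name, and returning 127 for a missing tool simply mirrors the exit-code convention the scripts document.

```bash
# Hypothetical helper (not shipped with the skill): check that the named
# tools are on PATH and that GEMINI_API_KEY is set.
preflight() {
  local missing=0 tool
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing dependency: $tool" >&2
      missing=1
    fi
  done
  if [ "$missing" -ne 0 ]; then
    return 127
  fi
  if [ -z "${GEMINI_API_KEY:-}" ]; then
    echo "GEMINI_API_KEY is not set" >&2
    return 1
  fi
  return 0
}

# Usage:
#   preflight curl jq uv                 # text tools
#   preflight curl jq uv base64 magick   # image tools
```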

## Script Overview

### context

Gathers authoritative, up-to-date context for deep research on various technical
topics (e.g., `gemini-api`, `mcp`, `home-assistant`). Run with `--list` to see
all available topics. This script should be your first tool for gathering
background knowledge or the latest documentation for an unfamiliar domain.

**Warning:** Output can be very large. **Do not** read output directly into your
conversation history. Pipe to `emerson` for analysis, or redirect to a file to
search/read locally.

```bash
scripts/context TOPIC
```

**Options:** `--list` (list available topics)

**Exit codes:** 0 success, 1 error, 127 missing dependency

**Examples:**

```bash
# List available topics
scripts/context --list

# Gather context for Gemini API
scripts/context gemini-api > gemini-context.xml

# Pipe context directly to analysis
scripts/context gemini-cli | scripts/emerson "How do commands work?"
```
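
Following the warning above, one way to keep large output out of the conversation is to write it to a file and print only its size. This is a sketch under assumptions: `save_context` and the `CONTEXT_CMD` override are hypothetical names introduced here for illustration.

```bash
# Hypothetical helper: save a topic's context to a file and print only the
# byte count, never the content itself.
# CONTEXT_CMD is an assumed override; it defaults to the documented script.
save_context() {
  local topic=$1 out=$2
  "${CONTEXT_CMD:-scripts/context}" "$topic" >"$out" || return $?
  wc -c <"$out" | tr -d ' '
}

# Usage:
#   save_context gemini-api gemini-context.xml
#   grep -n "structured output" gemini-context.xml | head
```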

### screenshot-describe

Generate concise alt-text for an image. Optimized for UI captures.

```bash
scripts/screenshot-describe IMAGE [PROMPT]
```

**Exit codes:** 0 success, 1 error, 127 missing dependency

### screenshot-compare

Compare two images for visual differences. Identifies layout shifts, color
changes, padding, and text updates.

```bash
scripts/screenshot-compare IMAGE1 IMAGE2 [PROMPT]
```

**Exit codes:** 0 differences found, 1 error, 2 images identical, 127 missing
dependency
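
Because `0` means differences were found and `2` means the images are identical, a plain `if` test treats identical images the same as an error. A minimal sketch of handling each documented status explicitly follows; the function name `compare_result` is hypothetical and not part of the skill.

```bash
# Map a screenshot-compare exit status to a label, following the exit codes
# documented above.
compare_result() {
  case "$1" in
    0) echo "different" ;;
    2) echo "identical" ;;
    127) echo "missing-dependency" ;;
    *) echo "error" ;;
  esac
}

# Usage (requires the skill's scripts and GEMINI_API_KEY):
#   scripts/screenshot-compare before.png after.png >report.txt
#   compare_result "$?"
```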

### photo-smart-crop

Smart crop images around detected people with a specified aspect ratio.
Prioritizes faces, expands for headroom, enforces aspect ratio.

```bash
scripts/photo-smart-crop [--ratio W:H] INPUT OUTPUT
```

**Options:** `--ratio W:H` (default 5:3)

**Exit codes:** 0 success, 1 error (no people found, API error), 2 rate limited,
127 missing dependency

**Examples:**

```bash
# Default 5:3 aspect ratio
scripts/photo-smart-crop family.jpg family-cropped.jpg

# 16:9 for video thumbnails
scripts/photo-smart-crop --ratio 16:9 portrait.jpg thumbnail.jpg

# Square crop for profile pictures
scripts/photo-smart-crop --ratio 1:1 headshot.png avatar.png
```
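
Since exit code 2 signals rate limiting rather than a hard failure, it can be worth retrying. The sketch below is one possible approach, not something the skill provides: `retry_on_rate_limit`, the attempt count, and the doubling delay are all illustrative assumptions.

```bash
# Retry a command while it exits with status 2 ("rate limited" in the table
# above). Any other status, success or error, is returned immediately.
retry_on_rate_limit() {
  local attempts=$1 delay=$2 rc=0 i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    rc=0
    "$@" || rc=$?
    if [ "$rc" -ne 2 ]; then
      return "$rc"
    fi
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
      delay=$((delay * 2)) # back off a little more each time
    fi
    i=$((i + 1))
  done
  return "$rc"
}

# Usage: up to 4 attempts, starting with a 5 second pause.
#   retry_on_rate_limit 4 5 scripts/photo-smart-crop --ratio 1:1 in.jpg out.jpg
```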

### photo-has-people

Detect if people feature prominently in a photo. Returns boolean via exit code.

```bash
scripts/photo-has-people [-q|--quiet] IMAGE
```

**Options:** `-q, --quiet` (suppress output)

**Exit codes:** 0 true (has people), 1 false (no people), 127 missing dependency

**Examples:**

```bash
# Check if photo has people
if scripts/photo-has-people photo.jpg; then
  echo "Found people"
fi
```
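
The same exit-code check extends to a batch of files. The following sketch prints only the photos that feature people; `photos_with_people` and the `HAS_PEOPLE_CMD` override are hypothetical names. Note that this simple form also treats a missing dependency (127) as "no people".

```bash
# Print the paths of the given photos that feature people.
# HAS_PEOPLE_CMD is an assumed override; it defaults to the documented script.
photos_with_people() {
  local photo
  for photo in "$@"; do
    if "${HAS_PEOPLE_CMD:-scripts/photo-has-people}" --quiet "$photo"; then
      printf '%s\n' "$photo"
    fi
  done
}

# Usage:
#   photos_with_people holiday/*.jpg >people.txt
```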

### emerson

Generate essay-length (~3000 words) analysis from text input. Produces
authoritative, footnoted Markdown. Can be combined with `context` to provide
rich background material.

```bash
scripts/emerson "PROMPT" < input.txt
```

**Exit codes:** 0 success, 1 error, 127 missing dependency

### pascal

Ask a question and get a short, paragraph-style response (wrapped to 80
columns). Optimized for quick answers.

```bash
scripts/pascal "QUESTION"
```

**Input:** Optional context via stdin

**Exit codes:** 0 success, 1 error, 127 missing dependency

**Examples:**

```bash
# Ask a quick question
scripts/pascal "What is the capital of Peru?"

# Summarize a file
cat article.md | scripts/pascal "Summarize this article"

# Explain code
scripts/pascal "Explain this code" < script.sh
```

### satisfies

Evaluate whether input text satisfies a condition. Returns boolean via exit
code.

```bash
echo "text" | scripts/satisfies [-v|--verbose] "CONDITION"
```

**Options:** `-v, --verbose` (output "true" or "false" to stderr)

**Exit codes:** 0 true (satisfies), 1 false (does not satisfy), 127 missing
dependency

**Examples:**

```bash
# Check if file mentions a topic
cat file.txt | scripts/satisfies "mentions Elvis" && echo "Found it"

# Validate content type
cat response.json | scripts/satisfies "is valid JSON with an 'id' field"

# Use in conditionals
if cat log.txt | scripts/satisfies "contains error messages"; then
  echo "Errors detected"
fi
```
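
An `if` or `&&` treats every non-zero status as false, so a missing dependency (127) looks the same as a condition that is not satisfied. A minimal sketch of separating the two cases is shown below; `bool_or_die` is a hypothetical wrapper, not part of the skill.

```bash
# Run a boolean command and print true/false, but fail loudly on any status
# other than 0 or 1 (for example 127, missing dependency).
# stdin is passed through to the wrapped command.
bool_or_die() {
  local rc=0
  "$@" || rc=$?
  case "$rc" in
    0) echo true ;;
    1) echo false ;;
    *)
      echo "unexpected exit status $rc from: $*" >&2
      return "$rc"
      ;;
  esac
}

# Usage:
#   cat log.txt | bool_or_die scripts/satisfies "contains error messages"
```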

### token-count

Count tokens in text using the Gemini API.

```bash
cat file.txt | scripts/token-count
```

**Exit codes:** 0 success, 1 error, 127 missing dependency
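
A token count can act as a guard before sending a large file to another script. The sketch below assumes that the counting command prints a bare integer on stdout, which this document does not confirm; `within_token_budget` and `TOKEN_COUNT_CMD` are hypothetical names.

```bash
# Return 0 if FILE is within BUDGET tokens, 1 if it exceeds it, and 2 if the
# count could not be obtained or parsed.
# Assumption: the counting command prints a plain integer on stdout.
within_token_budget() {
  local budget=$1 file=$2 count
  count=$("${TOKEN_COUNT_CMD:-scripts/token-count}" <"$file") || return 2
  # Reject anything that is not a plain non-negative integer.
  case "$count" in
    '' | *[!0-9]*) return 2 ;;
  esac
  [ "$count" -le "$budget" ]
}

# Usage:
#   within_token_budget 100000 document.md && scripts/emerson "Summarize" <document.md
```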

### popper

Interact with Android UIs using an AI agent powered by `uiautomator2` and
Gemini. This allows semantic control of the device by providing a goal in
natural language.

```bash
scripts/popper "GOAL"
```

**Options:** `--launch PACKAGE` (launch a package before starting),
`--stay-in-app` (restrict the run to a single application package),
`--dump-layout` (print the current simplified UI layout as JSON and exit)

**Environment:** `ANDROID_SERIAL` (optional, target specific device)

**Exit codes:** 0 success (task completed), 1 error (task failed), 2 timeout

**Examples:**

```bash
# General UI task
scripts/popper "accept all permissions"

# Launch an app and keep the run inside it
scripts/popper --launch com.example.fitness --stay-in-app "start a running exercise"

# Target specific device
env ANDROID_SERIAL=12345 scripts/popper "open settings"
```
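
Several goals can be chained by checking the exit code after each one. This is a sketch only: `run_goals` and the `POPPER_CMD` override are hypothetical names, and stopping at the first failure is a design choice rather than documented behavior.

```bash
# Run each goal in order and stop at the first one that does not succeed,
# distinguishing a timeout (2) from other failures.
# POPPER_CMD is an assumed override; it defaults to the documented script.
run_goals() {
  local goal rc
  for goal in "$@"; do
    rc=0
    "${POPPER_CMD:-scripts/popper}" "$goal" || rc=$?
    case "$rc" in
      0) ;;
      2)
        echo "timed out: $goal" >&2
        return 2
        ;;
      *)
        echo "failed ($rc): $goal" >&2
        return "$rc"
        ;;
    esac
  done
}

# Usage:
#   run_goals "accept all permissions" "start a running exercise"
```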

## Image Encoding Notes

- Images converted to lossless WebP for consistent encoding
- Alpha channel removed (`-alpha off`) so transparency-only differences are
  ignored
- Base64: use `-w 0` (Linux) or `-b 0` (macOS) for single-line output
- Single-image prompts: image before text (Gemini best practice)
- Multi-image comparison: text before images (Gemini best practice)
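
The platform-specific base64 flag noted above can be wrapped in a small function. This is a sketch for reference when building similar functionality, not the scripts' actual code; `b64_oneline` is a hypothetical name, and selecting the flag via `uname -s` is an assumption.

```bash
# Base64-encode stdin as a single line, choosing the flag by platform:
# -b 0 on macOS (Darwin), -w 0 elsewhere (GNU coreutils).
b64_oneline() {
  case "$(uname -s)" in
    Darwin) base64 -b 0 ;;
    *) base64 -w 0 ;;
  esac
}

# Usage:
#   b64_oneline <image.webp >image.b64
```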

## Safety Notes

- Scripts require network access to the Gemini API
- `GEMINI_API_KEY` must be set in the environment
- API calls may incur usage costs
- Large images increase request size and latency
- Scripts do not store or log input data

## Reference Material

- **Command Reference**: Detailed documentation for each script. See
  [references/command-index.md](references/command-index.md).
- **Troubleshooting**: Common issues and solutions. See
  [references/troubleshooting.md](references/troubleshooting.md).

Related Skills

wear-testing

from ithinkihaveacat/dotfiles

Provides a guide and ADB commands for testing Wear OS applications. Focuses on triggering system state changes, simulating edge cases, and interacting with Wear-specific surfaces (tiles, complications, watchfaces). Triggers: wear os, testing, wear os testing, test wear os app, adb, pixel watch, galaxy watch.

technical-writing-style

from ithinkihaveacat/dotfiles

Use this skill when authoring, reviewing, or editing technical documents, including bug reports, known issues, friction logs, PR descriptions, and the structural content and tone of commit messages. Use to ensure engineering content maintains a clear, factual, and constructive tone. Triggers: technical writing, bug report, known issue, friction log, PR description, pull request, commit message tone, review document.

jetpack

from ithinkihaveacat/dotfiles

Resolves AndroidX/Jetpack library information including version lookup, package-to-Maven-coordinate conversion, and source code downloading. Provides tools for inspecting Jetpack library implementations. Use when working with androidx libraries, resolving Maven coordinates, downloading Jetpack source code, checking library versions (alpha/beta/stable/snapshot), or inspecting AndroidX class implementations. Triggers: androidx, jetpack, maven coordinate, jetpack source, library version, snapshot, alpha, beta.

emumanager

from ithinkihaveacat/dotfiles

Manages Android SDK, emulators, and AVDs. Use when bootstrapping Android SDK, creating/starting/stopping AVDs, downloading system images, or troubleshooting emulator issues. Supports mobile, Wear OS, TV, and Automotive devices. Covers sdkmanager, avdmanager, emulator CLI. Triggers: android emulator, android virtual device, avd, system image, wear os emulator, tv emulator, automotive emulator, bootstrap android sdk.

coding-standards

from ithinkihaveacat/dotfiles

Use this skill when writing, reviewing, or validating code (shell scripts, Python, Markdown) or CLI tools to ensure they follow repository coding standards and conventions. Also use when formatting git commit messages (Conventional Commits syntax, line wrapping) or checking code for style compliance. Triggers: coding standards, style guide, validate change, review conventions, shellcheck, shfmt, markdown format, python, ruff, uvx, lint, commit message format, CLI design, code review, formatting.

adb

from ithinkihaveacat/dotfiles

Manipulates Android devices via ADB with emphasis on Wear OS. Provides scripts for screenshots, screen recording, tile management, WearableService inspection, package operations, and device configuration. Use when working with adb, Android devices, Wear OS watches, tiles, wearable data layer, dumpsys, or device debugging. Triggers: adb, android device, wear os, wearable, tile, screenshot, screen recording, dumpsys, logcat.
