adb-navigation-base
Base navigation patterns for Android device automation - gestures, waits, and UI interaction
Best use case
adb-navigation-base is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Base navigation patterns for Android device automation - gestures, waits, and UI interaction
Teams using adb-navigation-base should expect a more consistent output, faster repeated execution, less prompt rewriting, better workflow continuity with your supporting tools.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
- You already have the supporting tools or dependencies needed by this skill.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in
.claude/skills/adb-navigation-base/SKILL.mdinside your project - Restart your AI agent — it will auto-discover the skill
How adb-navigation-base Compares
| Feature / Agent | adb-navigation-base | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Base navigation patterns for Android device automation - gestures, waits, and UI interaction
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
---
## Quick Reference (30 seconds)
**Base Navigation Patterns for Android Automation**
**What It Does**: Provides fundamental gesture and wait-for patterns used in all UI automation. Smart retry logic, verification, and timeout handling.
**Core Capabilities**:
- 👆 **Smart Tap**: Retry-enabled tapping with verification
- 👉 **Swipe Gesture**: Directional swiping for scrolling
- ⏱️ **Wait For**: Element/text waiting with polling
**When to Use**:
- Building reliable automation workflows
- Handling timing-sensitive UI transitions
- Creating reusable navigation patterns
- Combining with adb-screen-detection for full automation
---
## Scripts
### 1. adb-tap.py
Smart tap action with automatic retry logic.
```bash
# Basic tap
uv run .claude/skills/adb-navigation-base/scripts/adb-tap.py --x 100 --y 200
# Tap with retry (auto-retry on failure)
uv run .claude/skills/adb-navigation-base/scripts/adb-tap.py --x 100 --y 200 --retry 3
# Tap with verification
uv run .claude/skills/adb-navigation-base/scripts/adb-tap.py --x 100 --y 200 --verify-text "Next"
# Tap with both
uv run .claude/skills/adb-navigation-base/scripts/adb-tap.py \
--x 100 --y 200 --retry 3 --verify-text "Loaded" --timeout 5
```
**Features**:
- Automatic retry on failure (configurable attempts)
- Post-tap screen verification
- Timeout protection
- Detailed logging
---
### 2. adb-swipe.py
Perform directional swipe gesture.
```bash
# Swipe up (scroll down content)
uv run .claude/skills/adb-navigation-base/scripts/adb-swipe.py --direction up --distance 300
# Swipe left (next page)
uv run .claude/skills/adb-navigation-base/scripts/adb-swipe.py --direction left --distance 500
# Swipe with verification
uv run .claude/skills/adb-navigation-base/scripts/adb-swipe.py \
--direction down --distance 400 --verify-text "End of List"
# Custom start position
uv run .claude/skills/adb-navigation-base/scripts/adb-swipe.py \
--direction up --distance 300 --start-x 500 --start-y 1000
```
**Directions**: up, down, left, right
**Features**:
- 4-direction swiping
- Configurable distance (pixels)
- Custom start position
- Post-swipe verification
- Momentum-aware (handles scroll inertia)
---
### 3. adb-wait-for.py
Wait for UI element or text to appear.
```bash
# Wait for text (default 10s timeout)
uv run .claude/skills/adb-navigation-base/scripts/adb-wait-for.py \
--method text --target "Loaded"
# Wait for element by text with timeout
uv run .claude/skills/adb-navigation-base/scripts/adb-wait-for.py \
--method text --target "Login" --timeout 5
# Wait with polling interval
uv run .claude/skills/adb-navigation-base/scripts/adb-wait-for.py \
--method text --target "Results" --interval 0.5
# JSON output for scripting
uv run .claude/skills/adb-navigation-base/scripts/adb-wait-for.py \
--method text --target "Ready" --json
```
**Detection Methods**: text (OCR-based)
**Features**:
- Configurable timeout (1-60s)
- Polling interval configuration
- Timeout exception handling
- Success/failure reporting
- Return element coordinates when found
---
## Usage Patterns
### Pattern 1: Basic Navigation Sequence
```bash
# 1. Tap login button
adb-tap.py --x 100 --y 200
# 2. Wait for login form
adb-wait-for.py --method text --target "Username" --timeout 5
# 3. Swipe down to see password field
adb-swipe.py --direction down --distance 200
# 4. Tap password field
adb-tap.py --x 100 --y 300
```
### Pattern 2: Reliable Action with Retry
```bash
# Tap with automatic retry (up to 3 times)
adb-tap.py --x 100 --y 200 --retry 3 --verify-text "Next Screen"
```
### Pattern 3: Scroll and Find
```bash
# 1. Start at top
# 2. Swipe up (scroll down) multiple times looking for element
for i in {1..5}; do
adb-swipe.py --direction up --distance 300
# Check if element appears
if adb-wait-for.py --method text --target "Find Me" --timeout 1; then
# Found! Tap it
adb-tap.py --x 150 --y 400
break
fi
done
```
---
## Architecture
**Design Principles**:
- **Composable**: Chain together for complex workflows
- **Fault-Tolerant**: Automatic retry and timeout protection
- **Verifiable**: All actions can have post-action verification
- **Observable**: Detailed logging of all operations
- **Stateless**: No dependency on previous state
**Dependency Relationship**:
```
adb-screen-detection/ (provides capture, OCR, find-element)
↓
adb-navigation-base/ (uses for wait-for and verification)
↓
adb-workflow-orchestrator/ (coordinates these patterns)
```
---
## Integration Points
**Used By**:
- `adb-workflow-orchestrator` - Workflow step execution
- `adb-magisk` - Magisk Manager navigation
- `adb-karrot` - Karrot app automation
**Dependency On**:
- `adb-screen-detection` - Screen capture and OCR
- System: `adb` command-line tool
---
## Error Handling
### Retry Logic
```
Attempt 1: Try action
├─ Success → Return
└─ Failure → Wait, Try again
Attempt 2: Try action
├─ Success → Return
└─ Failure → Wait, Try again
Attempt 3: Try action
├─ Success → Return
└─ Failure → Return error
```
### Timeout Handling
- Polls every N seconds (configurable)
- Returns TIMEOUT error if deadline exceeded
- Allows caller to handle timeout gracefully
---
## Workflows
This skill includes TOON-based workflow definitions for automation.
### What is TOON?
TOON (Task-Oriented Orchestration Notation) is a structured workflow definition language that pairs with Markdown documentation. Each workflow consists of:
- **[name].toon** - Orchestration logic and execution steps
- **[name].md** - Complete documentation and usage guide
This TOON+MD pairing approach is inspired by the BMAD METHOD pattern, adapted to use TOON instead of YAML for better orchestration support.
### Available Workflows
Workflow files are located in `workflow/` directory:
**Example Workflows (adb-navigation-base):**
- `workflow/tap-and-wait.toon` - Tap element and wait for result
- `workflow/scroll-and-find.toon` - Scroll to find element
- `workflow/gesture-sequence.toon` - Complex gesture combinations
### Running a Workflow
Execute any workflow using the ADB workflow orchestrator:
```bash
uv run .claude/skills/adb-workflow-orchestrator/scripts/adb-run-workflow.py \
--workflow .claude/skills/adb-navigation-base/workflow/tap-and-wait.toon \
--param device="127.0.0.1:5555"
```
### Workflow Documentation
Each workflow includes comprehensive documentation in the corresponding `.md` file:
- Purpose and use case
- Prerequisites and requirements
- Available parameters
- Execution phases and steps
- Success criteria
- Error handling and recovery
- Example commands
See the `workflow/` directory for complete TOON file definitions and documentation.
### Creating New Workflows
To create custom workflows for this skill:
1. Create a new `.toon` file in the `workflow/` directory
2. Define phases, steps, and parameters using TOON v4.0 syntax
3. Create corresponding `.md` file with comprehensive documentation
4. Test with the workflow orchestrator
For more information, refer to the TOON specification and the workflow orchestrator documentation.
---
**Version**: 1.0.0
**Status**: ✅ Foundation Tier
**Scripts**: 3
**Last Updated**: 2025-12-01
**Tier**: 2 (Foundation)Related Skills
adr-knowledge-base
ADR知見の体系的参照・適用。主要ADR抜粋(ADR_010, 013, 016, 019, 020, 021)・ADR検索・参照方法・技術決定パターン集・ADR作成判断基準。Phase C以降の技術決定時に使用。
add-database-table
Create database migrations, define table schemas, and manage constant tables (enums). Use when: (1) creating a new database table, (2) running 'make migrate.create', (3) adding enum/status values, (4) modifying table structure. REQUIRED first step before add-domain-entity.
acsets-algebraic-databases
ACSets (Attributed C-Sets): Algebraic databases as in-memory data structures. Category-theoretic formalism for relational databases generalizing graphs and data frames.
account-based-marketing-agent
AI агент для ABM. Используй для автоматизации ABM кампаний и персонализации outreach.
account-based-blueprint
Framework for consolidating manufacturing account intelligence, buying centers, and pursuit strategy.
obsidian-bases
Create and edit Obsidian Bases (.base files) with views, filters, formulas, and summaries. Use when working with .base files, creating database-like views of notes, or when the user mentions Bases, table views, card views, filters, or formulas in Obsidian.
Build Your Database Skill
Create a relational-db-agent skill that knows SQLModel async patterns
Backend Database Expert
专注于数据库设计、SQL 优化和迁移策略。
grail-miner
This skill assists in setting up, managing, and optimizing Grail miners on Bittensor Subnet 81, handling tasks like environment configuration, R2 storage, model checkpoint management, and performance tuning.
ux
This AI agent skill provides comprehensive guidance for creating professional and insightful User Experience (UX) designs, covering user research, information architecture, interaction design, visual guidance, and usability evaluation. It aims to produce actionable, user-centered solutions that avoid generic AI aesthetics.
ontopo
An AI agent skill to search for Israeli restaurants, check table availability, view menus, and retrieve booking links via the Ontopo platform, acting as an unofficial interface to its data.
whisper-transcribe
Transcribes audio and video files to text using OpenAI's Whisper CLI, enhanced with contextual grounding from local markdown files for improved accuracy.