clawphone-phone-control

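Use the phone-control MCP to perceive and operate the phone UI. Suited to reading the current phone state, opening apps, handling pop-ups, tapping controls, entering text, and troubleshooting failed phone automation. Read the UI state first when executing; any coordinate tap must be decided ad hoc from the current screenshot, and historical coordinates must never be treated as a general rule.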

3,891 stars

Best use case

clawphone-phone-control is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using clawphone-phone-control should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/clawphone-phone-control/SKILL.md --create-dirs "https://raw.githubusercontent.com/openclaw/skills/main/skills/be1human/clawphone-phone-control/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/clawphone-phone-control/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How clawphone-phone-control Compares

| Feature / Agent | clawphone-phone-control | Standard Approach |
| --- | --- | --- |
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |

Frequently Asked Questions

What does this skill do?

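It uses the phone-control MCP to perceive and operate the phone UI. It is suited to reading the current phone state, opening apps, handling pop-ups, tapping controls, entering text, and troubleshooting failed phone automation. It reads the UI state first; any coordinate tap must be decided ad hoc from the current screenshot, and historical coordinates must never be treated as a general rule.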

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# ClawPhone Phone Control

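## Quick workflow

1. Perceive first, then act.
2. Prefer text-based and node-based tools; use coordinate taps only ad hoc, against the current screenshot.
3. Fragile flows must be verified one step at a time. Do not fire off several blind steps and assume they "should have worked".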

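## Perception priority

1. Start with `get_screen_info` to determine the foreground app, the resolution, and the visible text.
2. Use `capture_screen` only when you need precise positioning.
3. To find a standard control, prefer `click_by_text` or `find_node`.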
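
The priority order above can be sketched as a small helper. This is only an illustration under stated assumptions: `call_tool` is a hypothetical dispatcher that forwards a tool name and arguments to the MCP server, and the result fields `texts`, `app`, and `found` are assumed shapes, not documented ones.

```python
from typing import Any, Callable, Dict

# Hypothetical dispatcher: call_tool(tool_name, arguments) -> result dict.
ToolCall = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def locate_control(call_tool: ToolCall, label: str) -> Dict[str, Any]:
    """Cheap perception first, node lookup next, screenshot only when needed."""
    info = call_tool("get_screen_info", {})
    # Assumed result field: "texts" lists the strings visible on screen.
    if label in info.get("texts", []):
        node = call_tool("find_node", {"text": label})
        # Assumed result field: "found" is truthy when a node matched.
        if node.get("found"):
            return {"via": "node", "app": info.get("app"), "node": node}
    # No usable node: take a screenshot for precise positioning.
    shot = call_tool("capture_screen", {})
    return {"via": "screenshot", "app": info.get("app"), "screenshot": shot}
```

The screenshot path is reached only when the label is not visible as text or no node matches, which keeps the more expensive call as the last resort.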

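## Operating principles

- Before typing, confirm that the input field really has focus.
- After launching an app, confirm that the foreground app really switched.
- Even when a tool returns "clicked", "launched", or "typed", treat a re-check of the UI as the source of truth.
- Before sending, confirm that the text really is in the input field.
- Before tapping send, confirm that the send control is actually visible.
- After sending, confirm again that the UI changed; do not conclude success from the tool's return value alone.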
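
One way to encode "the UI is the source of truth" is a wrapper that discards the tool's return value and decides success from a fresh screen read. This is a minimal sketch: `call_tool` is a hypothetical dispatcher, and the `check` predicate and the `texts` field it might inspect are assumptions.

```python
from typing import Any, Callable, Dict

# Hypothetical dispatcher: call_tool(tool_name, arguments) -> result dict.
ToolCall = Callable[[str, Dict[str, Any]], Dict[str, Any]]
ScreenCheck = Callable[[Dict[str, Any]], bool]


def act_and_verify(call_tool: ToolCall, tool: str, args: Dict[str, Any],
                   check: ScreenCheck) -> bool:
    """Run one action, then judge success only from the re-read screen state."""
    call_tool(tool, args)  # return value intentionally ignored
    screen = call_tool("get_screen_info", {})
    return bool(check(screen))
```

A caller might pass `check=lambda s: "hello" in s.get("texts", [])` after a `type_text` call, so a "typed" status with no visible text still counts as failure.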

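## Launching an app

1. You may call `launch_app(...)` first.
2. Immediately use `get_current_app` or `get_screen_info` to confirm that the foreground app really switched.
3. If the phone is still on the home screen or in another app, do not blindly relaunch along the same path.
4. Read the current screen state first, then decide whether to switch to tapping the home-screen icon, a notification entry, or another in-page entry point.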
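
The launch steps above could look like the following sketch. The argument name `package` and the `package` field in the `get_current_app` result are assumptions made for illustration; `call_tool` is a hypothetical dispatcher.

```python
from typing import Any, Callable, Dict

# Hypothetical dispatcher: call_tool(tool_name, arguments) -> result dict.
ToolCall = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def launch_and_confirm(call_tool: ToolCall, package: str) -> Dict[str, Any]:
    """Launch once, confirm the foreground app, and never relaunch blindly."""
    call_tool("launch_app", {"package": package})  # argument name is assumed
    current = call_tool("get_current_app", {})
    # Assumed result field: "package" identifies the foreground app.
    if current.get("package") == package:
        return {"ok": True, "screen": None}
    # Launch did not take effect: return the screen state so the caller can
    # pick another entry point (home-screen icon, notification, in-page link).
    screen = call_tool("get_screen_info", {})
    return {"ok": False, "screen": screen}
```

Returning the screen state instead of retrying leaves the choice of fallback path to the caller.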

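## Input fallback

1. First confirm that the input field has focus.
2. Try `type_text(...)` first.
3. If `type_text(...)` fails, or the text did not actually land in the input field, switch to the fallback chain immediately:
   - `set_clipboard(...)`
   - Long-press the input area as located in the current screenshot
   - Take a screenshot to confirm the menu
   - Tap `Paste` ad hoc, based on the current screenshot
4. After pasting, confirm again that the text really is in the input field before continuing.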
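
The fallback chain can be sketched as follows. Several names here are assumptions: `call_tool` is a hypothetical dispatcher, `long_press` and `tap` are hypothetical tool names (the source names only `type_text`, `set_clipboard`, and `capture_screen`), and `is_focused` and `find_paste_xy` are caller-supplied helpers that inspect the current screen or screenshot.

```python
from typing import Any, Callable, Dict, Tuple

# Hypothetical dispatcher: call_tool(tool_name, arguments) -> result dict.
ToolCall = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def enter_text(call_tool: ToolCall, text: str, field_xy: Tuple[int, int],
               is_focused: Callable[[Dict[str, Any]], bool],
               find_paste_xy: Callable[[Dict[str, Any]], Tuple[int, int]]) -> str:
    """Type text, verify it landed, and fall back to clipboard paste if not."""

    def text_visible() -> bool:
        # Assumed result field: "texts" lists the strings visible on screen.
        info = call_tool("get_screen_info", {})
        return any(text in t for t in info.get("texts", []))

    if not is_focused(call_tool("get_screen_info", {})):
        return "not_focused"
    try:
        call_tool("type_text", {"text": text})
    except Exception:
        pass  # a failed call is handled the same way as text not appearing
    if text_visible():
        return "typed"
    # Fallback chain: clipboard, long-press, screenshot, ad hoc paste tap.
    call_tool("set_clipboard", {"text": text})
    call_tool("long_press", {"x": field_xy[0], "y": field_xy[1]})  # hypothetical tool
    shot = call_tool("capture_screen", {})
    paste_x, paste_y = find_paste_xy(shot)  # coordinates from the current screenshot only
    call_tool("tap", {"x": paste_x, "y": paste_y})  # hypothetical tool
    return "pasted" if text_visible() else "failed"
```

The paste coordinates are computed from the screenshot taken inside the function, so they are never reused across calls.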

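## Coordinate rules

- A coordinate applies only to "this device, this page, this screenshot".
- Do not turn a coordinate that worked once into a fixed procedure.
- As soon as the page changes, the keyboard pops up, or the toolbar switches, earlier coordinates are invalid and you should take a new screenshot.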
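
A simple way to enforce these rules in agent code is to stamp every coordinate with the screenshot it came from and reject it once a newer screenshot exists. The classes below are hypothetical illustrations, not part of the skill or the MCP.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScreenPoint:
    """A coordinate tied to the screenshot it was read from."""
    x: int
    y: int
    shot_id: int


class ScreenshotScope:
    """Tracks the latest screenshot and invalidates older coordinates."""

    def __init__(self) -> None:
        self._shot_id = 0

    def new_screenshot(self) -> int:
        # Call whenever the screen is re-captured (page change, keyboard, toolbar).
        self._shot_id += 1
        return self._shot_id

    def point(self, x: int, y: int) -> ScreenPoint:
        if self._shot_id == 0:
            raise RuntimeError("take a screenshot before picking coordinates")
        return ScreenPoint(x, y, self._shot_id)

    def require_fresh(self, p: ScreenPoint) -> ScreenPoint:
        if p.shot_id != self._shot_id:
            raise ValueError("stale coordinate: re-capture the screen first")
        return p
```

Calling `require_fresh` right before each tap makes a reused coordinate fail loudly instead of silently hitting the wrong spot.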

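## Failure triage

Locate the point of failure in the following order:

1. Whether you are on the correct page.
2. Whether the tap hit the real control rather than nearby blank space.
3. Whether the input field really has focus.
4. Whether the text really is in the input field.
5. Whether the send button is really present and tappable.
6. Whether the UI really changed after the action.
7. After one path failed, whether you switched promptly to a more suitable fallback path instead of retrying by trial and error.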
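
The ordered checks can be expressed as a list that is walked until the first one fails. The key names are hypothetical labels for checks 1 to 6 above; check 7 is a process question and is left to the caller. Treating a missing result as a failure is a design assumption, on the reasoning that an unverified step should not be counted as passed.

```python
from typing import Dict, Optional

# Hypothetical keys mirroring triage steps 1-6, in order.
TRIAGE_ORDER = [
    "on_correct_page",
    "hit_real_control",
    "input_focused",
    "text_in_field",
    "send_button_tappable",
    "ui_changed",
]


def first_failure(results: Dict[str, bool]) -> Optional[str]:
    """Return the earliest check that failed or was never verified, else None."""
    for key in TRIAGE_ORDER:
        if not results.get(key, False):
            return key
    return None
```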

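## WeChat and other fragile scenarios

- Sending a chat message is a fragile flow; verify every step by default.
- `press_enter` is only a fallback; do not treat it as a main path that "is bound to send".
- WeChat's custom pop-ups are usually not in the accessibility tree, so when you hit a menu item, take a screenshot and tap ad hoc coordinates.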
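
A send step that follows these rules might look like the sketch below. The `sent` predicate (how to tell from screen state that the message went out), the `send_label` text, and the `texts` result field are all assumptions; `call_tool` is a hypothetical dispatcher.

```python
from typing import Any, Callable, Dict

# Hypothetical dispatcher: call_tool(tool_name, arguments) -> result dict.
ToolCall = Callable[[str, Dict[str, Any]], Dict[str, Any]]
ScreenCheck = Callable[[Dict[str, Any]], bool]


def send_message(call_tool: ToolCall, sent: ScreenCheck,
                 send_label: str = "Send") -> str:
    """Tap the visible send control first; use press_enter only as a fallback."""
    screen = call_tool("get_screen_info", {})
    # Only tap the send control if it is actually visible right now.
    if send_label in screen.get("texts", []):
        call_tool("click_by_text", {"text": send_label})
        if sent(call_tool("get_screen_info", {})):
            return "sent_via_button"
    # Fallback path; success is still decided by the screen, not the tool.
    call_tool("press_enter", {})
    if sent(call_tool("get_screen_info", {})):
        return "sent_via_enter"
    return "unconfirmed"
```

An "unconfirmed" result is a signal to re-read the screen and triage, not to repeat the same step.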

## Reference

- When you need tool documentation, read `tools-reference.md`.
