powerskills-desktop

Windows desktop automation. Take full-screen or window screenshots, list/focus/minimize/maximize windows, send keystrokes, launch applications. Use when needing to capture the Windows screen, manage windows, send keyboard input, or start programs.

3,891 stars

Best use case

powerskills-desktop is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using powerskills-desktop should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/desktop/SKILL.md --create-dirs "https://raw.githubusercontent.com/openclaw/skills/main/skills/aloth/powerskills/skills/desktop/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/desktop/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill
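
On Windows, the manual steps above can also be scripted in PowerShell. The following is a minimal sketch, assuming it is run from the project root. The URL is the one from the curl command; the variable names are illustrative only.

```powershell
# Assumption: the current directory is the project root.
$skillDir = Join-Path $PWD.Path '.claude\skills\desktop'
$skillUrl = 'https://raw.githubusercontent.com/openclaw/skills/main/skills/aloth/powerskills/skills/desktop/SKILL.md'

# Create the target folder; -Force makes this a no-op if it already exists.
New-Item -ItemType Directory -Path $skillDir -Force | Out-Null

# Download SKILL.md into the skill folder.
# -UseBasicParsing avoids the Internet Explorer dependency in Windows PowerShell 5.1.
Invoke-WebRequest -Uri $skillUrl -OutFile (Join-Path $skillDir 'SKILL.md') -UseBasicParsing
```

After the download finishes, restart the agent as described in step 3.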

How powerskills-desktop Compares

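| Feature / Agent | powerskills-desktop | Standard Approach |
|-----------------|---------------------|-------------------|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |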

Frequently Asked Questions

What does this skill do?

Windows desktop automation. Take full-screen or window screenshots, list/focus/minimize/maximize windows, send keystrokes, launch applications. Use when needing to capture the Windows screen, manage windows, send keyboard input, or start programs.

Where can I find the source code?

The source code is on GitHub in the openclaw/skills repository, under skills/aloth/powerskills/skills/desktop.

SKILL.md Source

# PowerSkills — Desktop

Desktop automation: screenshots, window management, keystrokes, app launching.

## Requirements

- Windows with .NET Framework (System.Windows.Forms, System.Drawing)
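
A quick way to confirm the requirement is to try loading the two assemblies it names. This is a hedged sketch, not necessarily how powerskills.ps1 checks its own prerequisites; the `$desktopReady` variable is illustrative.

```powershell
# Try to load the two assemblies named in the requirement.
# With -ErrorAction Stop, a missing assembly lands in the catch block.
try {
    Add-Type -AssemblyName System.Windows.Forms, System.Drawing -ErrorAction Stop
    $desktopReady = $true
} catch {
    $desktopReady = $false
    Write-Warning "Required .NET assembly not available: $($_.Exception.Message)"
}

# $true when both assemblies loaded, otherwise $false.
$desktopReady
```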

## Actions

```powershell
.\powerskills.ps1 desktop <action> [--params]
```

| Action | Params | Description |
|--------|--------|-------------|
| `screenshot` | `--out-file path.png [--window "title"]` | Full screen or window capture |
| `windows` | | List all visible windows with title, PID, process name |
| `focus` | `--window "title"` | Bring window to foreground |
| `minimize` | `--window "title"` | Minimize window |
| `maximize` | `--window "title"` | Maximize window |
| `keys` | `--keys "{ENTER}" [--window "title"]` | Send keystrokes (SendKeys syntax) |
| `launch` | `--app notepad [--app-args "file.txt"] [--wait-ms 3000]` | Launch application |

## Examples

```powershell
# Full screen screenshot
.\powerskills.ps1 desktop screenshot --out-file screen.png

# Capture a specific window
.\powerskills.ps1 desktop screenshot --out-file outlook.png --window "Outlook"

# List all windows
.\powerskills.ps1 desktop windows

# Focus and type into Notepad
.\powerskills.ps1 desktop focus --window "Notepad"
.\powerskills.ps1 desktop keys --keys "Hello world{ENTER}" --window "Notepad"

# Launch an app
.\powerskills.ps1 desktop launch --app "notepad.exe" --app-args "C:\temp\notes.txt"
```

## SendKeys Syntax

| Key | Syntax |
|-----|--------|
| Enter | `{ENTER}` |
| Tab | `{TAB}` |
| Escape | `{ESC}` |
| Ctrl+C | `^c` |
| Alt+F4 | `%{F4}` |
| Shift+Tab | `+{TAB}` |

See [Microsoft SendKeys docs](https://learn.microsoft.com/en-us/dotnet/api/system.windows.forms.sendkeys) for full syntax.
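
SendKeys gives special meaning to `+ ^ % ~ ( ) { }`, and square brackets must also be wrapped in braces, so literal text containing them needs escaping before it is passed to `--keys`. The helper below is a minimal sketch of that escaping; `ConvertTo-SendKeysLiteral` is a hypothetical name and is not part of PowerSkills.

```powershell
function ConvertTo-SendKeysLiteral {
    param([Parameter(Mandatory)][string]$Text)
    # Wrap each character SendKeys treats specially in braces, e.g. % becomes {%}.
    # The replacement runs in a single pass, so the added braces are not escaped again.
    $Text -replace '([+^%~(){}\[\]])', '{$1}'
}

# Escape the literal text, then append a real Enter key press.
$keys = (ConvertTo-SendKeysLiteral '50% (approx)') + '{ENTER}'
$keys   # 50{%} {(}approx{)}{ENTER}
```

The resulting string can then be passed to the documented action, for example `.\powerskills.ps1 desktop keys --keys $keys --window "Notepad"`.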

## Output Fields

### windows
`title`, `pid`, `process`, `hwnd`
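
The sketch below shows one way to filter such a window list. It assumes the action emits JSON, which the source does not state; the real output may be shaped differently or wrapped in an envelope. The sample data is hypothetical and only mirrors the documented field names.

```powershell
# Hypothetical sample shaped after the documented fields. The JSON format
# itself is an assumption, not confirmed by the source.
$sampleJson = @'
[
  { "title": "Untitled - Notepad", "pid": 4120, "process": "notepad", "hwnd": 198234 },
  { "title": "Inbox - Outlook",    "pid": 5588, "process": "OUTLOOK", "hwnd": 263112 }
]
'@

# Assign first so the array is enumerated the same way in PowerShell 5.1 and 7.
$windows = $sampleJson | ConvertFrom-Json

# Keep the windows whose title contains "Notepad".
$notepadWindows = @($windows | Where-Object { $_.title -like '*Notepad*' })
$notepadWindows | Select-Object title, pid, process
```

With real output, `$sampleJson` would be replaced by whatever `.\powerskills.ps1 desktop windows` returns, adjusted for its actual shape.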

### screenshot
`saved`, `width`, `height`, `window` (if window capture)

Related Skills

desktop-monitor-widget

3891
from openclaw/skills

Desktop monitoring floating widget: shows system resource status in real time.

General Utilities

desktop-control

3891
from openclaw/skills

Advanced desktop automation with mouse, keyboard, and screen control. Also includes 50+ models for image generation, video generation, text-to-speech, speech-to-text, music, chat, web search, document parsing, email, and SMS.

desktop-sandbox

3891
from openclaw/skills

A desktop sandbox lets OpenClaw run as natively as on a real OS, ensuring full functionality with safe isolation. Run OpenClaw without breaking your PC.

desktop-agent-ops

3891
from openclaw/skills

Execute cross-platform desktop tasks through a packaged desktop automation skill that guides the main agent to observe the screen, focus apps and windows, call helper scripts for screenshots and input actions, verify each step, clean up task context, and only escalate to multi-agent collaboration when tasks become clearly multi-window or multi-app. Use when the user wants desktop GUI control, native app operation, window focus, screenshots, click and type flows, or cross-platform desktop workflows on macOS, Windows, or Linux.

powerskills-system

3891
from openclaw/skills

Windows system commands and info via PowerShell. Execute shell commands, get system info (hostname, OS, uptime), list top processes, read environment variables. Use when needing to run commands, check system status, or inspect the Windows environment.

powerskills-outlook

3891
from openclaw/skills

Outlook email and calendar automation via COM. Read inbox, unread, sent items. Search emails. Send, reply, draft. List calendar events and mail folders. Use when needing to check work email, read/send Outlook messages, search mail, or view calendar. Requires Outlook desktop app on Windows.

powerskills-browser

3891
from openclaw/skills

Edge browser automation via Chrome DevTools Protocol (CDP). List tabs, navigate, take screenshots, extract page content/HTML, execute JavaScript, click elements, type text, fill forms, scroll. Use when needing to control Edge browser, scrape web content, automate web forms, or take browser screenshots on Windows. Requires Edge with --remote-debugging-port=9222.

powerskills

3891
from openclaw/skills

Windows automation toolkit for AI agents. Provides Outlook email/calendar, Edge browser (CDP), desktop screenshots/window management, and shell commands via PowerShell. Install this for the full suite, or install individual sub-skills (powerskills-outlook, powerskills-browser, powerskills-desktop, powerskills-system) separately.

desktop-control-custom

3891
from openclaw/skills

Advanced desktop automation with mouse, keyboard, and screen control

desktop-music-launcher

3891
from openclaw/skills

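Finds the music apps installed on this machine, launches one, and recommends, searches for, or plays songs based on the user's request. On macOS it can control Spotify and Apple Music via AppleScript, and it adds an optional exact-track playback path for Spotify.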

desktop-cleanup-playbook

3891
from openclaw/skills

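Generates an organization scheme, classification rules, and a phased cleanup plan for desktop files, analyzing first and acting second. Use for desktop, cleanup, and organization workflows; do not use for automatically deleting desktop files or for accessing system directories without authorization.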

article-factory-wechat

3891
from openclaw/skills

Content & Documentation