windows-screenshot-ocr

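Full-screen screenshots on Windows (with the mouse position marked automatically) plus native OCR text recognition. Runs entirely locally: no network connection and no API key required. Suited to workflows that need to analyze on-screen content from screenshots or automate OCR.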

3,891 stars

Best use case

windows-screenshot-ocr is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using windows-screenshot-ocr should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/windows-screenshot-ocr/SKILL.md --create-dirs "https://raw.githubusercontent.com/openclaw/skills/main/skills/a770438678/windows-screenshot-ocr/SKILL.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/windows-screenshot-ocr/SKILL.md inside your project
Restart your AI agent — it will auto-discover the skill

How windows-screenshot-ocr Compares

Feature / Agent	windows-screenshot-ocr	Standard Approach
Platform Support	Windows 10 / 11 (64-bit)	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

What does this skill do?

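It takes full-screen screenshots on Windows (marking the mouse position automatically) and runs native OCR text recognition on images. Everything runs locally, with no network connection and no API key. It suits workflows that analyze on-screen content from screenshots or automate OCR.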

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

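# Windows Screenshot + OCR Skill

This skill provides two core functions:
1. **Full-screen screenshot**: captures the current screen and marks the mouse position on the image with a red crosshair
2. **OCR text recognition**: recognizes text in an image using the OCR engine built into Windows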

## Requirements

- Windows 10 / 11 (64-bit)
- Python 3.8+
- Chinese and/or English OCR language pack installed (Settings → Language)

## Install dependencies

```bash
pip install mss pyautogui Pillow
pip install winrt
```
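
Before running either script, it can help to confirm that the packages above are importable. The following is a minimal sketch and not part of the skill itself: `missing_modules` is a hypothetical helper, and the import names (`mss`, `pyautogui`, `PIL` for Pillow, `winrt`) are assumed to correspond to the packages installed above.

```python
import importlib.util
from typing import Iterable, List

# Import names assumed for the pip packages above (Pillow is imported as PIL).
REQUIRED = ["mss", "pyautogui", "PIL", "winrt"]


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be located."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All dependencies found.")
```

Any name that is reported as missing can be installed with the corresponding `pip install` command above.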

## Usage

### Screenshot
```bash
python screenshot.py
```
Screenshots are saved to `E:\桌面\auto_screenshot\`, with a timestamp in the file name.
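
The contents of `screenshot.py` are not shown here, so the following is only a sketch of how such a capture could work with the installed packages. It assumes `mss` grabs the screen, `pyautogui` reports the cursor position, and Pillow draws the red crosshair. The function names, the `screenshot_` file prefix, and the crosshair size are hypothetical.

```python
import os
from datetime import datetime
from pathlib import PureWindowsPath

# Default folder taken from the skill's notes; the real script may differ.
SAVE_FOLDER = r"E:\桌面\auto_screenshot"


def build_screenshot_path(save_folder: str, now: datetime) -> str:
    """Build a timestamped PNG path (the 'screenshot_' prefix is an assumption)."""
    name = f"screenshot_{now:%Y%m%d_%H%M%S}.png"
    return str(PureWindowsPath(save_folder) / name)


def capture_with_cursor(save_folder: str = SAVE_FOLDER, arm: int = 15) -> str:
    """Grab the whole virtual screen, mark the cursor in red, and save a PNG."""
    # Third-party imports stay local so the path helper works without them.
    import mss
    import pyautogui
    from PIL import Image, ImageDraw

    os.makedirs(save_folder, exist_ok=True)
    with mss.mss() as sct:
        monitor = sct.monitors[0]  # index 0 covers all monitors combined
        shot = sct.grab(monitor)
    img = Image.frombytes("RGB", shot.size, shot.bgra, "raw", "BGRX")

    # Translate screen coordinates into image coordinates, since the
    # virtual screen origin can be non-zero on multi-monitor setups.
    mouse_x, mouse_y = pyautogui.position()
    x, y = mouse_x - monitor["left"], mouse_y - monitor["top"]

    draw = ImageDraw.Draw(img)
    draw.line([(x - arm, y), (x + arm, y)], fill=(255, 0, 0), width=2)
    draw.line([(x, y - arm), (x, y + arm)], fill=(255, 0, 0), width=2)

    path = build_screenshot_path(save_folder, datetime.now())
    img.save(path)
    return path
```

On displays with DPI scaling, the cursor coordinates and the captured pixels may not line up exactly, so the crosshair position should be checked on the target machine.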

### OCR
```bash
python windows_ocr.py
```
Set `image_path` in the script to the path of the target image. The recognized text is saved to `ocr_result.txt`.
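
The source of `windows_ocr.py` is likewise not included, so the following is a sketch of one way to call the Windows OCR engine from Python. It assumes the `winrt` package installed above exposes the `winrt.windows.media.ocr`, `winrt.windows.graphics.imaging`, and `winrt.windows.storage` projections. The function names are hypothetical; only `ocr_result.txt` comes from the skill's description.

```python
import asyncio
import os
from typing import Iterable, List


def lines_to_text(lines: Iterable[str]) -> str:
    """Join recognized lines, dropping blank lines and surrounding whitespace."""
    return "\n".join(s.strip() for s in lines if s.strip())


async def recognize_lines(image_path: str) -> List[str]:
    """Run the Windows OCR engine on one image file (Windows only)."""
    # Imported lazily so lines_to_text stays usable without winrt installed.
    from winrt.windows.graphics.imaging import BitmapDecoder
    from winrt.windows.media.ocr import OcrEngine
    from winrt.windows.storage import FileAccessMode, StorageFile

    # StorageFile expects an absolute path.
    file = await StorageFile.get_file_from_path_async(os.path.abspath(image_path))
    stream = await file.open_async(FileAccessMode.READ)
    decoder = await BitmapDecoder.create_async(stream)
    bitmap = await decoder.get_software_bitmap_async()

    engine = OcrEngine.try_create_from_user_profile_languages()
    if engine is None:
        raise RuntimeError("No OCR language pack matches the user profile languages")
    result = await engine.recognize_async(bitmap)
    return [line.text for line in result.lines]


def ocr_to_file(image_path: str, out_path: str = "ocr_result.txt") -> str:
    """Recognize text in image_path and write it to out_path as UTF-8."""
    text = lines_to_text(asyncio.run(recognize_lines(image_path)))
    with open(out_path, "w", encoding="utf-8") as fh:
        fh.write(text)
    return text
```

A call such as `ocr_to_file(r"C:\path\to\image.png")` would then play the role of editing `image_path` in the script.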

## Files

- `screenshot.py`: screenshot script that marks the mouse position
- `windows_ocr.py`: OCR script that uses the native Windows engine
- `README.md`: detailed documentation

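## Notes

- The default screenshot folder is `E:\桌面\auto_screenshot\`; change `save_folder` in the script to use another location
- OCR depends on the Windows language packs. If recognition fails, add the corresponding language in system settings
- Runs entirely locally: no network access and no data uploaded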
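
Because recognition depends on installed language packs, a script can check which language is usable before running OCR. The sketch below is an illustration under assumptions: `first_supported` and `windows_ocr_supports` are hypothetical helpers, the language tags are examples, and the `winrt` projection is assumed to expose `OcrEngine.is_language_supported`.

```python
from typing import Callable, Iterable, Optional


def first_supported(tags: Iterable[str], is_supported: Callable[[str], bool]) -> Optional[str]:
    """Return the first language tag accepted by is_supported, or None."""
    for tag in tags:
        if is_supported(tag):
            return tag
    return None


def windows_ocr_supports(tag: str) -> bool:
    """Ask the Windows OCR engine whether a language tag is usable (Windows only)."""
    # Imported lazily so first_supported can be used without winrt installed.
    from winrt.windows.globalization import Language
    from winrt.windows.media.ocr import OcrEngine

    return OcrEngine.is_language_supported(Language(tag))


if __name__ == "__main__" and __import__("sys").platform == "win32":
    # Example preference order: Simplified Chinese first, then English.
    chosen = first_supported(["zh-Hans-CN", "en-US"], windows_ocr_supports)
    print("Usable OCR language:", chosen or "none (add a language pack in Settings)")
```

If the result is `None`, adding the language in system settings as described above is the expected fix.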

## Author

QClaw AI Assistant (generated from a user conversation, 2026-03-26)

Related Skills

windows-ui-controller

3891

from openclaw/skills

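A skill pack for automating Windows software: uses pywinauto to control WeChat, QQ, NetEase Cloud Music, or any other Windows application. Includes a complete tutorial, dependencies, and best practices.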

OpenClaw Install Guide (WSL2 Windows)

3891

from openclaw/skills

Complete step-by-step installation guide for OpenClaw on Windows 10/11 with WSL2, includes common pitfalls and solutions from real installation experience.

web-screenshot

3891

from openclaw/skills

Capture screenshots of web pages running on local or remote servers using Puppeteer in headless Chromium. Use when user asks to screenshot web pages, capture web UI, take website screenshots, or document web application interfaces. Supports login-required SPAs (Vue/React/Angular) by performing form-based authentication before navigating. Generates screenshots and an optional result.json with per-page descriptions.

wsl-chrome-cdp - Access the Windows Chrome browser from WSL2

3891

from openclaw/skills

**Version:** 1.0.0

windows-healing-gateway

3891

from openclaw/skills

OpenClaw Gateway Self-Healing System for Windows

macpilot-screenshot-ocr

3891

from openclaw/skills

Capture screenshots and extract text via OCR using MacPilot. Take full-screen, region, or window screenshots, and recognize text in images or screen areas with multi-language support.

screenshot-tool

3891

from openclaw/skills

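Web page and document screenshot tool. Supports full-page web screenshots and conversion of PPT/Word/Excel/PDF files to high-resolution images. Preserves the original styling, with 300 DPI output.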

windows-tts

3891

from openclaw/skills

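TTS that speaks directly on Windows 11 (calls powershell.exe + System.Speech from WSL2/TUI). Use it when the user says "say it", "read it aloud", "voice announcement", or "use TTS"; when the user reports "no sound", "the mp3 generated by tts is empty", or "it won't play"; or when Chinese speech is needed and OpenClaw's built-in tts is unavailable.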