windows-screenshot-ocr

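Full-screen screenshots on Windows (with the mouse position marked automatically) plus native OCR text recognition. Runs entirely locally: no network connection and no API key required. Suited to workflows that analyze on-screen content from screenshots or automate OCR.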

3,891 stars

Best use case

windows-screenshot-ocr is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using windows-screenshot-ocr should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/windows-screenshot-ocr/SKILL.md --create-dirs "https://raw.githubusercontent.com/openclaw/skills/main/skills/a770438678/windows-screenshot-ocr/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/windows-screenshot-ocr/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How windows-screenshot-ocr Compares

Feature / Agent           windows-screenshot-ocr   Standard Approach
Platform Support          Not specified            Limited / Varies
Context Awareness         High                     Baseline
Installation Complexity   Unknown                  N/A

Frequently Asked Questions

What does this skill do?

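It takes full-screen screenshots on Windows (marking the mouse position automatically) and runs native OCR text recognition. It runs entirely locally, with no network connection and no API key required, and suits workflows that analyze on-screen content from screenshots or automate OCR.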

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# Windows Screenshot + OCR Skill

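This skill provides two core functions:
1. **Full-screen screenshot**: captures the current screen and marks the mouse position on the screenshot with a red crosshair
2. **OCR text recognition**: recognizes text in an image using the OCR engine built into Windows

## Requirements

- Windows 10 / 11 (64-bit)
- Python 3.8+
- Chinese/English OCR language packs installed (System Settings → Language)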
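
The requirements above can be checked before running either script. The following is a minimal sketch and not part of the skill's own files: `environment_problems` is a hypothetical helper, and using the interpreter's pointer size as a stand-in for "64-bit" is an assumption. It does not verify the Windows version or that OCR language packs are installed.

```python
import struct
import sys


def environment_problems(platform_name=None, version=None, pointer_bits=None):
    """Return a list of human-readable problems; an empty list means OK.

    All arguments default to the values of the running interpreter and can
    be overridden, which makes the check easy to exercise on any machine.
    """
    platform_name = sys.platform if platform_name is None else platform_name
    version = sys.version_info[:2] if version is None else tuple(version)[:2]
    pointer_bits = struct.calcsize("P") * 8 if pointer_bits is None else pointer_bits

    problems = []
    if platform_name != "win32":  # sys.platform is "win32" on 64-bit Windows too
        problems.append(f"Windows is required, found platform {platform_name!r}")
    if version < (3, 8):
        problems.append(f"Python 3.8+ is required, found {version[0]}.{version[1]}")
    if pointer_bits != 64:
        problems.append(f"A 64-bit interpreter is expected, found {pointer_bits}-bit")
    return problems
```

Calling `environment_problems()` with no arguments inspects the current interpreter; printing the returned list gives the user something actionable before any screenshot or OCR call fails.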

## Installing dependencies

```bash
pip install mss pyautogui Pillow
pip install winrt
```

## Usage

### Screenshot
```bash
python screenshot.py
```
Screenshots are saved to `E:\桌面\auto_screenshot\`, with a timestamp in the file name.
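
The source of `screenshot.py` is not shown here, so the following is only a sketch of how a capture with a red crosshair and a timestamped file name could be put together with the three listed dependencies (`mss`, `pyautogui`, `Pillow`). The function names, the `screenshot_YYYYMMDD_HHMMSS.png` pattern, and the crosshair size are all assumptions made for illustration.

```python
from datetime import datetime
from pathlib import Path


def cursor_in_image(cursor_xy, monitor):
    """Map a virtual-desktop cursor position to pixel coordinates in a capture.

    `monitor` is a dict with "left", "top", "width" and "height" keys, the
    shape mss uses for its monitor entries. Returns None when the cursor
    lies outside the captured area.
    """
    x = cursor_xy[0] - monitor["left"]
    y = cursor_xy[1] - monitor["top"]
    if 0 <= x < monitor["width"] and 0 <= y < monitor["height"]:
        return (x, y)
    return None


def timestamped_path(save_folder, now=None):
    """Build a path such as <save_folder>/screenshot_20260326_153000.png."""
    now = now or datetime.now()
    return Path(save_folder) / f"screenshot_{now:%Y%m%d_%H%M%S}.png"


def capture_with_crosshair(save_folder, arm=15):
    """Grab all monitors, draw a red crosshair at the cursor, save as PNG."""
    # Third-party imports stay local so the helpers above remain usable
    # on machines without a display or without these packages.
    import mss
    import pyautogui
    from PIL import Image, ImageDraw

    with mss.mss() as sct:
        monitor = sct.monitors[0]  # index 0 is the union of all monitors
        shot = sct.grab(monitor)
    image = Image.frombytes("RGB", shot.size, shot.bgra, "raw", "BGRX")

    pos = cursor_in_image(tuple(pyautogui.position()), monitor)
    if pos is not None:
        x, y = pos
        draw = ImageDraw.Draw(image)
        draw.line([(x - arm, y), (x + arm, y)], fill=(255, 0, 0), width=2)
        draw.line([(x, y - arm), (x, y + arm)], fill=(255, 0, 0), width=2)

    out_path = timestamped_path(save_folder)
    out_path.parent.mkdir(parents=True, exist_ok=True)
    image.save(out_path)
    return out_path
```

Subtracting the monitor's `left`/`top` offset matters on multi-monitor setups, where the virtual desktop can start at negative coordinates. Display scaling may still shift the marker slightly, so this is worth verifying on the target machine.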

### OCR recognition
```bash
python windows_ocr.py
```
Set `image_path` in the script to the path of the target image; the recognition result is saved to `ocr_result.txt`.
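
For orientation, here is a hedged sketch of what calling the Windows built-in OCR engine through the `winrt` package can look like. It is not the contents of `windows_ocr.py`, which is not reproduced on this page. The function names are hypothetical, and it assumes the snake_case Python projection of the `Windows.Media.Ocr`, `Windows.Graphics.Imaging`, and `Windows.Storage` APIs that `winrt` exposes.

```python
import asyncio
from pathlib import Path


async def recognize_text(image_path):
    """Run the Windows built-in OCR engine on an image file; return its text."""
    # Imported here because the winrt package only works on Windows.
    from winrt.windows.graphics.imaging import BitmapDecoder
    from winrt.windows.media.ocr import OcrEngine
    from winrt.windows.storage import FileAccessMode, StorageFile

    # StorageFile expects an absolute path.
    abs_path = str(Path(image_path).resolve())
    file = await StorageFile.get_file_from_path_async(abs_path)
    stream = await file.open_async(FileAccessMode.READ)
    decoder = await BitmapDecoder.create_async(stream)
    bitmap = await decoder.get_software_bitmap_async()

    engine = OcrEngine.try_create_from_user_profile_languages()
    if engine is None:
        raise RuntimeError("No OCR language pack matches the user's languages")
    result = await engine.recognize_async(bitmap)
    return "\n".join(line.text for line in result.lines)


def save_result(text, out_path="ocr_result.txt"):
    """Write recognized text as UTF-8 so Chinese characters survive."""
    out_path = Path(out_path)
    out_path.write_text(text, encoding="utf-8")
    return out_path


def ocr_to_file(image_path, out_path="ocr_result.txt"):
    """Synchronous wrapper: recognize image_path and save the text."""
    return save_result(asyncio.run(recognize_text(image_path)), out_path)
```

On a Windows machine with the language packs installed, `ocr_to_file("some_image.png")` would write the recognized lines to `ocr_result.txt` in the current directory. Taking the image path as a parameter, rather than editing a variable in the script, is a design choice of this sketch only.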

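## Files

- `screenshot.py`: screenshot script that marks the mouse position
- `windows_ocr.py`: OCR script that uses the native Windows engine
- `README.md`: detailed documentation

## Notes

- The default screenshot path is `E:\桌面\auto_screenshot\`; change `save_folder` in the script to use a different one
- OCR depends on the Windows system language packs; if recognition fails, add the corresponding language in system settings
- Runs entirely locally: no network access, and no data is uploaded

## Author

QClaw AI Assistant (generated from a user conversation, 2026-03-26)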

Related Skills

windows-ui-controller

3891
from openclaw/skills

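Windows software automation skill pack: uses pywinauto to control WeChat, QQ, NetEase Cloud Music, or any other Windows application. Includes a complete tutorial, dependencies, and best practices.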

OpenClaw Install Guide (WSL2 Windows)

3891
from openclaw/skills

Complete step-by-step installation guide for OpenClaw on Windows 10/11 with WSL2, includes common pitfalls and solutions from real installation experience.

web-screenshot

3891
from openclaw/skills

Capture screenshots of web pages running on local or remote servers using Puppeteer in headless Chromium. Use when user asks to screenshot web pages, capture web UI, take website screenshots, or document web application interfaces. Supports login-required SPAs (Vue/React/Angular) by performing form-based authentication before navigating. Generates screenshots and an optional result.json with per-page descriptions.

wsl-chrome-cdp - Access the Windows Chrome browser from WSL2

3891
from openclaw/skills

**Version:** 1.0.0

windows-healing-gateway

3891
from openclaw/skills

OpenClaw Gateway Self-Healing System for Windows

macpilot-screenshot-ocr

3891
from openclaw/skills

Capture screenshots and extract text via OCR using MacPilot. Take full-screen, region, or window screenshots, and recognize text in images or screen areas with multi-language support.

screenshot-tool

3891
from openclaw/skills

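Web page and document screenshot tool. Supports full-page web screenshots and converting PPT/Word/Excel/PDF files to high-definition images. Preserves the original styling, with 300 DPI output.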

windows-tts

3891
from openclaw/skills

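TTS that speaks out loud directly on Windows 11 (calls powershell.exe + System.Speech from WSL2/TUI). Use when the user says "say it", "read it aloud", "voice announcement", or "use TTS"; when they report "no sound", "the mp3 generated by tts is empty", or "it won't play"; and when Chinese speech is needed but OpenClaw's built-in tts is unavailable.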

screenshot-ux-auditor

3891
from openclaw/skills

Turn app screenshots into structured UX, copywriting, and conversion audits with issue severity and recommended fixes.

screenshot-to-task

3891
from openclaw/skills

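Turn to-dos or ideas captured in screenshots into tasks, notes, and priorities. Use for screenshots, tasks, and capture workflows; do not use for fabricating screenshot content or replacing an OCR system.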

app-store-screenshots-generator

3817
from openclaw/skills

Generate production-ready App Store screenshots for iOS apps using AI agents, Next.js, and html-to-image

article-factory-wechat

3891
from openclaw/skills

Content & Documentation