vision
Analyze images, screenshots, diagrams, and visual content. Use when you need to understand visual content like screenshots, architecture diagrams, UI mockups, or error screenshots.
Best use case
vision is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Teams using vision should expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in .claude/skills/vision/SKILL.md inside your project
- Restart your AI agent; it will auto-discover the skill
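The placement step above can be scripted. What follows is a minimal sketch, assuming SKILL.md has already been downloaded to a local path. The function name `install_skill` and its parameters are hypothetical and used only for illustration; the only detail taken from this page is the target location .claude/skills/vision/SKILL.md.

```python
from pathlib import Path
import shutil


def install_skill(downloaded_file: Path, project_root: Path) -> Path:
    """Copy a downloaded SKILL.md into <project_root>/.claude/skills/vision/.

    Hypothetical helper: it only mirrors the manual placement step and
    does not download anything or restart the agent.
    """
    if not downloaded_file.is_file():
        raise FileNotFoundError(f"No skill file at {downloaded_file}")

    # Target path as given in the installation steps.
    target = project_root / ".claude" / "skills" / "vision" / "SKILL.md"

    # Create the nested skill directory if it does not exist yet.
    target.parent.mkdir(parents=True, exist_ok=True)

    shutil.copyfile(downloaded_file, target)
    return target
```

For example, calling `install_skill(Path("SKILL.md"), Path("."))` from the project root would copy the file into place; the agent still needs a restart afterwards to discover the skill.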
How vision Compares
| Feature / Agent | vision | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
It analyzes images, screenshots, diagrams, and other visual content. Use it when you need to understand screenshots, architecture diagrams, UI mockups, or error screenshots.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
You are a Vision Analyst specialized in interpreting visual content.

## Focus
- Describe visible UI elements, text, errors, code, layout, and diagrams.
- Extract any legible text accurately, preserving formatting when relevant.
- Note uncertainty or low-confidence readings.

## Output
- Provide concise, actionable observations.
- Call out anything that looks broken, inconsistent, or suspicious.
Related Skills
processing-computer-vision-tasks
Process images using object detection, classification, and segmentation. Use when requesting "analyze image", "object detection", "image classification", or "computer vision". Trigger with relevant phrases based on skill purpose.
vision-exploration
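End-state vision exploration. The user offers a vague idea and the AI leads the conversation through the chain "probe the value → uncover the motivation → derive the evolution → sketch the end state", helping the user see the furthest possible future. No limits, no convergence, purely divergent.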
computer-vision-expert
SOTA Computer Vision Expert (2026). Specialized in YOLO26, Segment Anything 3 (SAM 3), Vision Language Models, and real-time spatial analysis.
azure-ai-vision-imageanalysis-py
Azure AI Vision Image Analysis SDK for captions, tags, objects, OCR, people detection, and smart cropping. Use for computer vision and image understanding tasks. Triggers: "image analysis", "computer vision", "OCR", "object detection", "ImageAnalysisClient", "image caption".
azure-ai-vision-imageanalysis-java
Build image analysis applications with Azure AI Vision SDK for Java. Use when implementing image captioning, OCR text extraction, object detection, tagging, or smart cropping.
Senior Computer Vision
Product Strategy — Vision, Positioning, and Roadmap
OpenCV — Computer Vision Library
You are an expert in OpenCV (Open Source Computer Vision Library), the most popular library for real-time computer vision. You help developers build image processing pipelines, object detection systems, video analysis tools, augmented reality, and document processing using OpenCV's 2,500+ algorithms for image manipulation, feature detection, camera calibration, 3D reconstruction, and DNN inference — in Python, C++, or JavaScript.
LLaVA - Large Language and Vision Assistant
Open-source vision-language model for conversational image understanding.
BLIP-2: Vision-Language Pre-training
Comprehensive guide to using Salesforce's BLIP-2 for vision-language tasks with frozen image encoders and large language models.
Azure AI Custom Vision Skill
This skill provides expert guidance for Azure AI Custom Vision. Covers best practices, decision making, limits & quotas, security, integrations & coding patterns, and deployment. It combines local quick-reference content with remote documentation fetching capabilities.
Azure AI Vision Skill
This skill provides expert guidance for Azure AI Vision. Covers decision making, limits & quotas, configuration, integrations & coding patterns, and deployment. It combines local quick-reference content with remote documentation fetching capabilities.