vision

Analyze images, screenshots, diagrams, and visual content - Use when you need to understand visual content like screenshots, architecture diagrams, UI mockups, or error screenshots.

242 stars

byaiskillstore

View on GitHub Installation ↓

Best use case

vision is best used when you need a repeatable AI agent workflow instead of a one-off prompt. It is especially useful for teams working in multi. Analyze images, screenshots, diagrams, and visual content - Use when you need to understand visual content like screenshots, architecture diagrams, UI mockups, or error screenshots.

Analyze images, screenshots, diagrams, and visual content - Use when you need to understand visual content like screenshots, architecture diagrams, UI mockups, or error screenshots.

Users should expect a more consistent workflow output, faster repeated execution, and less time spent rewriting prompts from scratch.

Practical example

Example input

Use the "vision" skill to help with this workflow task. Context: Analyze images, screenshots, diagrams, and visual content - Use when you need to understand visual content like screenshots, architecture diagrams, UI mockups, or error screenshots.

Example output

A structured workflow result with clearer steps, more consistent formatting, and an output that is easier to reuse in the next run.

When to use this skill

Use this skill when you want a reusable workflow rather than writing the same prompt again and again.

When not to use this skill

Do not use this when you only need a one-off answer and do not need a reusable workflow.
Do not use it if you cannot install or maintain the related files, repository context, or supporting tools.

Installation

Claude Code / Cursor / Codex

$curl -o ~/.claude/skills/vision/SKILL.md --create-dirs "https://raw.githubusercontent.com/aiskillstore/marketplace/main/skills/0xsero/vision/SKILL.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/vision/SKILL.md inside your project
Restart your AI agent — it will auto-discover the skill

How vision Compares

Feature / Agent	vision	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

What does this skill do?

Analyze images, screenshots, diagrams, and visual content - Use when you need to understand visual content like screenshots, architecture diagrams, UI mockups, or error screenshots.

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

You are a Vision Analyst specialized in interpreting visual content.

## Focus
- Describe visible UI elements, text, errors, code, layout, and diagrams.
- Extract any legible text accurately, preserving formatting when relevant.
- Note uncertainty or low-confidence readings.

## Output
- Provide concise, actionable observations.
- Call out anything that looks broken, inconsistent, or suspicious.

Related Skills

vision-exploration

242

from aiskillstore/marketplace

终局愿景探索。用户抛出一个模糊 idea，AI 主导引导，通过"追问价值 → 挖掘动机 → 推导演化 → 画终局"的链路，帮用户看到未来最远的可能性。不设限，不收敛，纯发散。

computer-vision-expert

242

from aiskillstore/marketplace

SOTA Computer Vision Expert (2026). Specialized in YOLO26, Segment Anything 3 (SAM 3), Vision Language Models, and real-time spatial analysis.

azure-ai-vision-imageanalysis-py

242

from aiskillstore/marketplace

Azure AI Vision Image Analysis SDK for captions, tags, objects, OCR, people detection, and smart cropping. Use for computer vision and image understanding tasks. Triggers: "image analysis", "computer vision", "OCR", "object detection", "ImageAnalysisClient", "image caption".

azure-ai-vision-imageanalysis-java

242

from aiskillstore/marketplace

Build image analysis applications with Azure AI Vision SDK for Java. Use when implementing image captioning, OCR text extraction, object detection, tagging, or smart cropping.

vision-analysis

242

from aiskillstore/marketplace

Analyze, describe, and extract information from images using the MiniMax vision MCP tool. Use when: user shares an image file path or URL (any message containing .jpg, .jpeg, .png, .gif, .webp, .bmp, or .svg file extension) or uses any of these words/phrases near an image: "analyze", "analyse", "describe", "explain", "understand", "look at", "review", "extract text", "OCR", "what is in", "what's in", "read this image", "see this image", "tell me about", "explain this", "interpret this", in connection with an image, screenshot, diagram, chart, mockup, wireframe, or photo. Also triggers for: UI mockup review, wireframe analysis, design critique, data extraction from charts, object detection, person/animal/activity identification. Triggers: any message with an image file extension (jpg, jpeg, png, gif, webp, bmp, svg), or any request to analyze/describ/understand/review/extract text from an image, screenshot, diagram, chart, photo, mockup, or wireframe.

senior-computer-vision

242

from aiskillstore/marketplace

World-class computer vision skill for image/video processing, object detection, segmentation, and visual AI systems. Expertise in PyTorch, OpenCV, YOLO, SAM, diffusion models, and vision transformers. Includes 3D vision, video analysis, real-time processing, and production deployment. Use when building vision AI systems, implementing object detection, training custom vision models, or optimizing inference pipelines.

azure-quotas

242

from aiskillstore/marketplace

Check/manage Azure quotas and usage across providers. For deployment planning, capacity validation, region selection. WHEN: "check quotas", "service limits", "current usage", "request quota increase", "quota exceeded", "validate capacity", "regional availability", "provisioning limits", "vCPU limit", "how many vCPUs available in my subscription".

DevOps & Infrastructure

raindrop-io

242

from aiskillstore/marketplace

Manage Raindrop.io bookmarks with AI assistance. Save and organize bookmarks, search your collection, manage reading lists, and organize research materials. Use when working with bookmarks, web research, reading lists, or when user mentions Raindrop.io.

Data & Research

zlibrary-to-notebooklm

242

from aiskillstore/marketplace

自动从 Z-Library 下载书籍并上传到 Google NotebookLM。支持 PDF/EPUB 格式，自动转换，一键创建知识库。

discover-skills

242

from aiskillstore/marketplace

当你发现当前可用的技能都不够合适（或用户明确要求你寻找技能）时使用。本技能会基于任务目标和约束，给出一份精简的候选技能清单，帮助你选出最适配当前任务的技能。

web-performance-seo

242

from aiskillstore/marketplace

Fix PageSpeed Insights/Lighthouse accessibility "!" errors caused by contrast audit failures (CSS filters, OKLCH/OKLAB, low opacity, gradient text, image backgrounds). Use for accessibility-driven SEO/performance debugging and remediation.

project-to-obsidian

242

from aiskillstore/marketplace

将代码项目转换为 Obsidian 知识库。当用户提到 obsidian、项目文档、知识库、分析项目、转换项目时激活。【激活后必须执行】： 1. 先完整阅读本 SKILL.md 文件 2. 理解 AI 写入规则（默认到 00_Inbox/AI/、追加式、统一 Schema） 3. 执行 STEP 0: 使用 AskUserQuestion 询问用户确认 4. 用户确认后才开始 STEP 1 项目扫描 5. 严格按 STEP 0 → 1 → 2 → 3 → 4 顺序执行【禁止行为】： - 禁止不读 SKILL.md 就开始分析项目 - 禁止跳过 STEP 0 用户确认 - 禁止直接在 30_Resources 创建（先到 00_Inbox/AI/） - 禁止自作主张决定输出位置