visa-doc-translate

Translates visa application documents (images) into English and creates a bilingual PDF containing the original and the translation.

144,923 stars
Complexity: easy

About this skill

This skill assists with visa applications by automating the translation of image-based documents into English. It takes an image file path (HEIC included) as input and runs the following steps without asking for user confirmation: it converts HEIC files to PNG, rotates the image according to its EXIF data (inferring the rotation if needed), extracts all text with the first OCR method that works (macOS Vision framework, EasyOCR, or Tesseract), and translates the content into English. The translation preserves the original structure, uses terminology appropriate for visa applications, keeps proper nouns in the original language with English in parentheses, and carries names, numbers, dates, and amounts over accurately. Finally, it generates a bilingual PDF whose first page shows the processed original image and whose second page presents the structured English translation.

Best use case

Users applying for visas often need to translate supporting documents (e.g., proof of employment, bank statements, retirement certificates) from their native language into English. This skill streamlines that process by providing an automated, accurate, and professionally formatted translation within a bilingual PDF.

A well-structured bilingual PDF document for each input image. Page 1 will display the processed, correctly oriented original image of the document. Page 2 will contain a professional English translation of the document's content, formatted to reflect the original structure, with proper nouns handled appropriately and all numbers, dates, and amounts accurately preserved.

Practical example

Example input

User provides the path to a visa application document image, e.g., `/Users/affaan/Documents/visa/proof_of_employment.heic` or `/home/user/visa_docs/bank_statement.png`

Example output

A generated PDF file in the same directory as the input, for example `proof_of_employment_Translated.pdf`, containing:
*   Page 1: The original image of the employment proof document (converted and rotated if necessary), centered and scaled to fit an A4 page.
*   Page 2: A professional English translation of the document's text that maintains the original structure, uses visa-appropriate terminology, keeps proper nouns in the original language with English in parentheses, and renders Chinese names in pinyin (e.g., WU Zhengye).

When to use this skill

  • Use this skill when you have image files (like scans or photos) of visa application documents in a non-English language (e.g., Chinese) and require a professional, structured English translation for submission. It is particularly useful for documents where maintaining original formatting and professional terminology is crucial.

When not to use this skill

  • Do not use this skill for highly sensitive legal documents that explicitly require a certified human translation. Avoid using it if the original document is not an image file (e.g., a pure text document) or if the target translation language is not English. It's also not suitable for documents unrelated to visa applications where specialized domain-specific translation is not a priority.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/visa-doc-translate/SKILL.md --create-dirs "https://raw.githubusercontent.com/affaan-m/everything-claude-code/main/docs/zh-CN/skills/visa-doc-translate/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/visa-doc-translate/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How visa-doc-translate Compares

| Feature / Agent | visa-doc-translate | Standard Approach |
|---|---|---|
| Platform Support | Claude | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | easy | N/A |

Frequently Asked Questions

What does this skill do?

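It translates visa application documents (images) into English and creates a bilingual PDF containing the original and the translation.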

Which AI agents support this skill?

This skill is designed for Claude.

How difficult is it to install?

The installation complexity is rated as easy. You can find the installation instructions above.

Where can I find the source code?

The source is the SKILL.md file in the affaan-m/everything-claude-code repository on GitHub; the full URL appears in the installation command above.

SKILL.md Source

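You are helping translate documents for visa applications.

## Instructions

When the user provides an image file path, **automatically** perform the following steps **without** asking for confirmation:

1. **Image conversion**: If the file is in HEIC format, convert it to PNG with `sips -s format png <input> --out <output>`

2. **Image rotation**:
   * Check the EXIF orientation data
   * Rotate the image automatically according to the EXIF data
   * If the EXIF orientation is 6, rotate 90 degrees clockwise
   * Apply additional rotation as needed (test 180 degrees if the document appears upside down)

3. **OCR text extraction**:
   * Automatically try multiple OCR methods:
     * macOS Vision framework (preferred on macOS)
     * EasyOCR (cross-platform, does not require tesseract)
     * Tesseract OCR (if available)
   * Extract all text from the document
   * Identify the document type (deposit certificate, employment certificate, retirement certificate, etc.)

4. **Translation**:
   * Translate all text content into English professionally
   * Preserve the structure and formatting of the original document
   * Use professional terminology appropriate for visa applications
   * Keep proper nouns in the original language, with the English in parentheses
   * For Chinese names, use pinyin format (e.g., WU Zhengye)
   * Preserve all numbers, dates, and amounts accurately

5. **PDF generation**:
   * Create a Python script using the PIL and reportlab libraries
   * Page 1: Show the rotated original image, centered and scaled to fit an A4 page
   * Page 2: Show the English translation with appropriate formatting:
     * Title centered and bold
     * Content left-aligned with appropriate spacing
     * Professional layout suitable for official documents
   * Add a note at the bottom: "This is a certified English translation of the original document"
   * Run the script to generate the PDF

6. **Output**: Create a PDF file named `<original_filename>_Translated.pdf` in the same directory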
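
Steps 1 and 2 can be sketched with the standard library alone. This is a minimal sketch, not the skill's actual script: the helper names (`needs_conversion`, `heic_to_png_command`, `clockwise_rotation`) are hypothetical, and the mapping covers only the four non-mirrored EXIF orientations. In practice, Pillow's `ImageOps.exif_transpose` is one way to apply the same correction to a loaded image.

```python
from pathlib import Path
from typing import List, Optional

# EXIF orientation value -> clockwise rotation (degrees) needed to display the
# stored pixels upright. Mirrored orientations (2, 4, 5, 7) are left out of
# this sketch.
CLOCKWISE_DEGREES = {1: 0, 3: 180, 6: 90, 8: 270}


def needs_conversion(input_path: str) -> bool:
    """HEIC files are converted first; other image formats are used as-is."""
    return Path(input_path).suffix.lower() == ".heic"


def heic_to_png_command(input_path: str) -> List[str]:
    """Build the `sips` call from step 1; the PNG is written next to the input."""
    src = Path(input_path)
    dst = src.with_suffix(".png")
    return ["sips", "-s", "format", "png", str(src), "--out", str(dst)]


def clockwise_rotation(orientation: Optional[int]) -> int:
    """Return the clockwise rotation for an EXIF orientation, or 0 if unknown."""
    return CLOCKWISE_DEGREES.get(orientation, 0)


if __name__ == "__main__":
    path = "BankStatement.HEIC"
    if needs_conversion(path):
        # The argument list could be passed to subprocess.run on macOS.
        print(" ".join(heic_to_png_command(path)))
    print(clockwise_rotation(6))
```

Returning the command as an argument list keeps paths with spaces intact if it is later handed to `subprocess.run`. `sips` ships with macOS only, so another converter would be needed on other platforms.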

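## Supported Documents

* Bank deposit certificates (存款证明)
* Income certificates (收入证明)
* Employment certificates (在职证明)
* Retirement certificates (退休证明)
* Property certificates (房产证明)
* Business licenses (营业执照)
* ID cards and passports
* Other official documents

## Technical Implementation

### OCR Methods (tried in order)

1. **macOS Vision framework** (macOS only):
   ```python
   import Vision
   from Foundation import NSURL
   ```

2. **EasyOCR** (cross-platform):
   ```bash
   pip install easyocr
   ```

3. **Tesseract OCR** (if available):
   ```bash
   brew install tesseract tesseract-lang
   pip install pytesseract
   ```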
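
The "tried in order" behaviour can be sketched as a small fallback loop. This is an illustrative sketch only: `extract_text` and the three stand-in engine functions are hypothetical, and a real version would wrap the Vision, EasyOCR, and pytesseract calls behind the same `(path) -> str` shape.

```python
from typing import Callable, Iterable, Optional, Tuple

OcrEngine = Tuple[str, Callable[[str], str]]


def extract_text(image_path: str, engines: Iterable[OcrEngine]) -> Tuple[Optional[str], str]:
    """Try each (name, function) pair in order; return the first non-empty result."""
    for name, run in engines:
        try:
            text = run(image_path).strip()
        except Exception:
            # Missing package, unsupported platform, or a failed run: move on.
            continue
        if text:
            return name, text
    return None, ""


# Hypothetical stand-ins for the three engines, used only to show the control flow.
def vision_ocr(path: str) -> str:
    raise ImportError("Vision framework is only available on macOS")


def easyocr_ocr(path: str) -> str:
    return "在职证明 Employment Certificate"


def tesseract_ocr(path: str) -> str:
    return "unused"


if __name__ == "__main__":
    engine, text = extract_text(
        "EmploymentLetter.jpg",
        [("vision", vision_ocr), ("easyocr", easyocr_ocr), ("tesseract", tesseract_ocr)],
    )
    print(engine, text)
```

Treating an empty result the same as an exception means an engine that runs but recognises nothing does not end the search early.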

### Required Python Libraries

```bash
pip install pillow reportlab
```

For the macOS Vision framework:

```bash
pip install pyobjc-framework-Vision pyobjc-framework-Quartz
```
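
With pillow and reportlab installed, the two-page layout could look roughly like the sketch below. The function names, the 36-point margin, and the font sizes are assumptions chosen for illustration, not values taken from the skill. The sketch does not wrap long lines or add the footer note.

```python
# A4 in PDF points (1 pt = 1/72 inch), derived from 210 mm x 297 mm.
A4_WIDTH = 210 / 25.4 * 72
A4_HEIGHT = 297 / 25.4 * 72


def fit_on_a4(img_w: float, img_h: float, margin: float = 36.0):
    """Scale an image to fit inside the A4 margins and centre it.

    Returns (x, y, width, height) in points, with (x, y) the lower-left corner.
    """
    scale = min((A4_WIDTH - 2 * margin) / img_w, (A4_HEIGHT - 2 * margin) / img_h)
    width, height = img_w * scale, img_h * scale
    return (A4_WIDTH - width) / 2, (A4_HEIGHT - height) / 2, width, height


def build_pdf(image_path: str, title: str, lines: list, out_path: str) -> None:
    """Write page 1 (image) and page 2 (translation). Needs pillow and reportlab."""
    from PIL import Image
    from reportlab.pdfgen import canvas

    with Image.open(image_path) as img:
        x, y, w, h = fit_on_a4(*img.size)

    pdf = canvas.Canvas(out_path, pagesize=(A4_WIDTH, A4_HEIGHT))
    pdf.drawImage(image_path, x, y, width=w, height=h)  # page 1: original image
    pdf.showPage()

    pdf.setFont("Helvetica-Bold", 16)  # page 2: centred bold title
    pdf.drawCentredString(A4_WIDTH / 2, A4_HEIGHT - 72, title)
    pdf.setFont("Helvetica", 11)
    cursor = A4_HEIGHT - 110
    for line in lines:  # left-aligned body, one string per line
        pdf.drawString(72, cursor, line)
        cursor -= 16
    pdf.save()
```

The built-in Helvetica font has no CJK glyphs, so proper nouns kept in Chinese on page 2 would need a CJK-capable font registered with reportlab first.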

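## Important Guidelines

* **Do not** ask the user for confirmation at each step
* Determine the best rotation angle automatically
* If one OCR method fails, try the others
* Make sure all numbers, dates, and amounts are carried over accurately
* Use clean, professional formatting
* Complete the whole process and report the location of the final PDF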
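
One way to act on the accuracy guideline is a quick check that every digit sequence in the OCR text also appears in the translation. The sketch below is an assumption about how such a check might look, not part of the skill; `numbers_in` and `missing_numbers` are hypothetical names. It cannot see amounts written in Chinese numerals, and it will flag dates whose month is spelled out in English.

```python
import re
from collections import Counter

# Runs of digits, optionally with thousands separators or a decimal part.
NUMBER = re.compile(r"\d+(?:[.,]\d+)*")


def numbers_in(text: str) -> Counter:
    """Count each numeric token after dropping thousands separators."""
    return Counter(token.replace(",", "") for token in NUMBER.findall(text))


def missing_numbers(source: str, translation: str) -> list:
    """Return numeric tokens that occur more often in the source than in the translation."""
    diff = numbers_in(source) - numbers_in(translation)
    return sorted(diff.elements())


if __name__ == "__main__":
    source = "金额 120,000.00 元,日期 2024年3月15日"
    translation = "Amount: RMB 120,000.00, date: 2024/3/15"
    print(missing_numbers(source, translation))
```

An empty result only shows that no digits were dropped; it does not confirm that each number ended up next to the right label.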

## Usage Examples

```bash
/visa-doc-translate RetirementCertificate.PNG
/visa-doc-translate BankStatement.HEIC
/visa-doc-translate EmploymentLetter.jpg
```

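## Output Example

The skill will:

1. Extract text using the available OCR methods
2. Translate it into professional English
3. Generate `<filename>_Translated.pdf` containing:
   * Page 1: The original document image
   * Page 2: The professional English translation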
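
The output name follows directly from the input path. A minimal sketch of that rule, with `translated_pdf_path` as a hypothetical helper name:

```python
from pathlib import Path


def translated_pdf_path(image_path: str) -> Path:
    """Return `<original_filename>_Translated.pdf` in the same directory as the input."""
    src = Path(image_path)
    return src.with_name(f"{src.stem}_Translated.pdf")


if __name__ == "__main__":
    for name in ("RetirementCertificate.PNG", "BankStatement.HEIC", "EmploymentLetter.jpg"):
        print(translated_pdf_path(name))
```

Using the stem rather than the full file name keeps the original image extension out of the PDF name.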

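Well suited to visa applications for Australia, the United States, Canada, the United Kingdom, and other countries that require translated documents.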
