latex-translation-guide

Translate LaTeX documents preserving math formulas and structure

191 stars

Best use case

latex-translation-guide is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using latex-translation-guide should expect more consistent output, faster repeated runs, and less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/latex-translation-guide/SKILL.md --create-dirs "https://raw.githubusercontent.com/wentorai/research-plugins/main/skills/tools/ocr-translate/latex-translation-guide/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/latex-translation-guide/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How latex-translation-guide Compares

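| Feature / Agent | latex-translation-guide | Standard Approach |
|-----------------|-------------------------|-------------------|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |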

Frequently Asked Questions

What does this skill do?

Translate LaTeX documents preserving math formulas and structure

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# LaTeX Document Translation Guide

## Overview

Translating LaTeX academic documents requires preserving mathematical formulas, cross-references, citations, and formatting while converting the text between languages. This guide covers tools and techniques for translating LaTeX papers, from command-line utilities to full document pipelines. It is particularly useful for making research accessible across language barriers.

## LaTeXTrans Approach

```bash
# Install LaTeXTrans
pip install latextrans

# Translate a LaTeX file
latextrans translate paper.tex --from en --to zh --output paper_zh.tex
```

### How It Works

1. **Parse**: Extract text segments while preserving LaTeX commands
2. **Protect**: Shield math environments (`$...$`, `\[...\]`, equations)
3. **Translate**: Send text segments to translation API
4. **Reconstruct**: Reassemble with original LaTeX structure
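
The four steps above can be sketched in a few lines of Python. This is a minimal illustration and not LaTeXTrans's actual implementation: the `PROTECTED` pattern, the `XPROTECT<n>X` placeholder format, and the `translate_latex` function are hypothetical names chosen for this sketch, and the `translate` argument stands in for whatever translation API is used.

```python
import re
from typing import Callable

# Hypothetical pattern for this sketch: inline math, \[...\] display math,
# and a few commands with one braced argument.
PROTECTED = re.compile(
    r"\$[^$]+\$"                        # inline math $...$
    r"|\\\[[\s\S]*?\\\]"                # display math \[...\]
    r"|\\(?:cite|ref|label)\{[^{}]*\}"  # selected commands
)


def translate_latex(source: str, translate: Callable[[str], str]) -> str:
    """Parse, protect, translate, and reconstruct a LaTeX string."""
    protected: list[str] = []

    def shield(match: re.Match) -> str:
        # Remember the original span and leave a numbered placeholder.
        protected.append(match.group(0))
        return f"XPROTECT{len(protected) - 1}X"

    # Steps 1 and 2: find protected spans and swap in placeholders.
    shielded = PROTECTED.sub(shield, source)
    # Step 3: translate the remaining text.
    translated = translate(shielded)
    # Step 4: put the original LaTeX back in place of each placeholder.
    return re.sub(
        r"XPROTECT(\d+)X",
        lambda m: protected[int(m.group(1))],
        translated,
    )


# Usage with a stand-in "translator" that only upper-cases the text.
print(translate_latex(r"The loss $\mathcal{L}$ is cited in \cite{smith}.", str.upper))
```

A real translation engine may alter or drop placeholders, so a production pipeline would also need to verify that every placeholder survives step 3.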

### Python Usage

```python
from latextrans import LatexTranslator

translator = LatexTranslator(
    source_lang="en",
    target_lang="zh",
    engine="google",  # or "deepl", "openai"
)

# Translate a file
translator.translate_file("paper.tex", "paper_zh.tex")

# Translate a string
result = translator.translate(
    r"The loss function $\mathcal{L}(\theta)$ is minimized "
    r"using gradient descent with learning rate $\eta$."
)
# Output preserves $\mathcal{L}(\theta)$ and $\eta$ untouched
```

## MathTranslate Tool

```bash
# Install MathTranslate (specialized for math-heavy papers)
pip install mathtranslate

# Translate arXiv paper directly
translate_arxiv 2301.00001 -o translated.tex

# Translate local file
translate_tex paper.tex -o paper_translated.tex
```

### MathTranslate Features

```python
# Configuration
import mathtranslate

# Set translation backend (choose one; a later call overrides an earlier one)
mathtranslate.config.set_translator("google")  # free
# mathtranslate.config.set_translator("openai")  # higher quality

# Translate with customization
mathtranslate.translate(
    input_file="paper.tex",
    output_file="paper_zh.tex",
    source_lang="en",
    target_lang="zh-CN",
    threads=4,  # parallel translation
)
```

## Manual Translation Tips

### Protecting Math Environments

```python
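import re

# "$$...$$" is listed before "$...$" so display math is not split in two.
# (?<!\\) skips escaped characters such as "\$5" and line breaks like "\\[2pt]".
MATH_PATTERN = re.compile(
    r"(?<!\\)\$\$[\s\S]*?\$\$"                                  # $$...$$
    r"|(?<!\\)\$[^$]+\$"                                        # $...$
    r"|(?<!\\)\\\[[\s\S]*?\\\]"                                 # \[...\]
    r"|\\begin\{(equation|align)(\*?)\}[\s\S]*?\\end\{\1\2\}"   # equation(*), align(*)
)


def extract_and_protect(latex_text: str) -> tuple[str, dict[str, str]]:
    """Replace math environments with placeholders before translation."""
    placeholders: dict[str, str] = {}

    def replace_math(match: re.Match) -> str:
        # Number placeholders in order of appearance: __MATH_0__, __MATH_1__, ...
        key = f"__MATH_{len(placeholders)}__"
        placeholders[key] = match.group(0)
        return key

    protected = MATH_PATTERN.sub(replace_math, latex_text)
    return protected, placeholders


def restore_math(translated: str, placeholders: dict[str, str]) -> str:
    """Restore math environments after translation."""
    for key, value in placeholders.items():
        translated = translated.replace(key, value)
    return translated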
```
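
Translation engines sometimes drop, duplicate, or rewrite placeholders, in which case restoring the math silently produces a broken document. A small check run before restoration can catch this. The sketch below assumes the `__MATH_<n>__` placeholder format used above; `check_placeholders` is a hypothetical helper, not part of any library.

```python
import re

PLACEHOLDER = re.compile(r"__MATH_\d+__")


def check_placeholders(
    translated: str, placeholders: dict[str, str]
) -> dict[str, list[str]]:
    """Report placeholders that were lost, invented, or repeated in translation."""
    found = PLACEHOLDER.findall(translated)
    expected = set(placeholders)
    return {
        "missing": sorted(expected - set(found)),
        "unexpected": sorted(set(found) - expected),
        "duplicated": sorted({key for key in found if found.count(key) > 1}),
    }


# Usage: an empty report means it is safe to restore the math.
report = check_placeholders(
    "Let __MATH_0__ be given.",
    {"__MATH_0__": "$x$", "__MATH_1__": "$y$"},
)
if any(report.values()):
    print("Placeholder problems:", report)
```

If the report is not empty, retranslating the affected segment is usually safer than restoring a partial set of placeholders.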

### Commands to Protect

```latex
% Always protect these:
\ref{...}      % Cross-references
\cite{...}     % Citations
\label{...}    % Labels
\eqref{...}    % Equation references
\url{...}      % URLs
\texttt{...}   % Code/monospace

% Math environments to protect:
$...$          % Inline math
$$...$$        % Display math
\[...\]        % Display math
\begin{equation}...\end{equation}
\begin{align}...\end{align}
\begin{theorem}...\end{theorem}  % Custom environments
```
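
The commands in the first list can be shielded the same way as math. The following is a rough sketch under stated assumptions: `PROTECTED_COMMANDS`, `protect_commands`, and the `__CMD_<n>__` placeholder format are hypothetical names, and the pattern only handles a single braced argument without nested braces, plus an optional star and optional `[...]` argument.

```python
import re

# Commands whose argument must not be translated (taken from the list above).
PROTECTED_COMMANDS = ("ref", "cite", "label", "eqref", "url", "texttt")

COMMAND_PATTERN = re.compile(
    r"\\(?:" + "|".join(PROTECTED_COMMANDS) + r")"  # command name
    r"\*?"                                          # optional star
    r"(?:\[[^\]]*\])?"                              # optional [...] argument
    r"\{[^{}]*\}"                                   # braced argument, no nesting
)


def protect_commands(latex_text: str) -> tuple[str, dict[str, str]]:
    """Replace protected commands with placeholders before translation."""
    placeholders: dict[str, str] = {}

    def replace(match: re.Match) -> str:
        key = f"__CMD_{len(placeholders)}__"
        placeholders[key] = match.group(0)
        return key

    return COMMAND_PATTERN.sub(replace, latex_text), placeholders


# Usage: protect, translate the protected text, then substitute back.
text = r"See \eqref{eq:loss} and \cite[p.~3]{smith2020} in \texttt{main.py}."
protected, saved = protect_commands(text)
print(protected)
restored = protected
for key, value in saved.items():
    restored = restored.replace(key, value)
```

Variants such as `\citep` or `\autoref` are not in the tuple and would need to be added explicitly, and arguments containing nested braces would need a real parser instead of a regular expression.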

## Bilingual Output

```latex
% Side-by-side bilingual document (compile with XeLaTeX or LuaLaTeX)
\documentclass{article}
\usepackage{amsmath}             % provides \text
\usepackage[scheme=plain]{ctex}  % CJK support; one possible choice
\usepackage{paracol}

\begin{document}
\begin{paracol}{2}
\switchcolumn[0]
The transformer architecture has become...

\switchcolumn[1]
Transformer架构已经成为...

\switchcolumn[0]
Self-attention computes $\text{Attention}(Q,K,V) = \text{softmax}(\frac{QK^T}{\sqrt{d_k}})V$

\switchcolumn[1]
自注意力计算 $\text{Attention}(Q,K,V) = \text{softmax}(\frac{QK^T}{\sqrt{d_k}})V$
\end{paracol}
\end{document}
```
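
When the source and translated paragraphs are already paired, the `paracol` body can be generated instead of written by hand. This is a minimal sketch: `to_paracol` is a hypothetical helper that follows the `\switchcolumn[0]` / `\switchcolumn[1]` layout shown above and assumes both paragraph lists are aligned one-to-one.

```python
def to_paracol(pairs: list[tuple[str, str]]) -> str:
    """Build a two-column paracol body from (source, translation) paragraphs."""
    lines = [r"\begin{paracol}{2}"]
    for source, translation in pairs:
        # Left column holds the source text, right column the translation.
        lines += [r"\switchcolumn[0]", source, ""]
        lines += [r"\switchcolumn[1]", translation, ""]
    lines.append(r"\end{paracol}")
    return "\n".join(lines)


# Usage: the result goes between \begin{document} and \end{document}.
body = to_paracol([
    ("The transformer architecture has become...", "Transformer架构已经成为..."),
])
print(body)
```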

## Translation Backends

| Backend | Quality | Cost | Speed |
|---------|---------|------|-------|
| Google Translate | Good | Free | Fast |
| DeepL | Better | Freemium | Fast |
| OpenAI GPT-4 | Best | Paid | Slower |
| Claude | Best | Paid | Slower |

## References

- [MathTranslate](https://github.com/SUSYUSTC/MathTranslate)
- [PDFMathTranslate](https://github.com/Byaidu/PDFMathTranslate)
- [arXiv LaTeX Cleaner](https://github.com/google-research/arxiv-latex-cleaner)