azure-ai-translation-document-py
Azure AI Document Translation SDK for batch translation of documents with format preservation. Use for translating Word, PDF, Excel, PowerPoint, and other document formats at scale.
About this skill
This skill wraps the Azure AI Document Translation SDK for Python so that AI agents can run high-volume batch translation across many document types. The service preserves the original formatting, layout, and structure of files such as Microsoft Word, PDF, Excel, and PowerPoint. An agent can process large sets of documents, translate them into multiple target languages, and store the translated versions in Azure Blob Storage. Typical applications include enterprise content localization, multilingual data processing, and automated workflows that need format-preserving document translation.
Best use case
- Localizing large volumes of corporate documents (e.g., reports, manuals, legal contracts) for international business operations.
- Translating academic papers or research documents to facilitate global collaboration and information sharing.
- Enabling AI agents to process user-uploaded documents in various source languages and provide translated versions on demand.
- Automating the translation of internal company knowledge bases, training materials, or technical documentation.
- Translating financial statements, regulatory filings, or legal documents for international compliance and review.
Expected result: a document translation job is started and monitored, and a batch of translated documents becomes available in the specified target Azure Blob Storage container. The translated documents preserve the content, formatting, and structure of the originals. The skill returns the job status and details so that the agent can track progress and retrieve output locations.
Practical example
Example input
{"source_document_urls": ["https://yourstorage.blob.core.windows.net/source-container/quarterly_report_en.docx", "https://yourstorage.blob.core.windows.net/source-container/legal_agreement_en.pdf"], "target_language_codes": ["es", "fr"], "source_language_code": "en", "target_container_url": "https://yourstorage.blob.core.windows.net/translated-documents-output/"}
Example output
{"translation_job_id": "87c4a0e9-b2f5-4e78-a0d3-3b1e2c4f5a6b", "status": "Running", "created_on": "2023-10-27T10:00:00Z", "last_updated_on": "2023-10-27T10:05:00Z", "documents_total": 2, "documents_completed": 0, "documents_failed": 0, "summary_url": "https://yourtranslator.cognitiveservices.azure.com/translator/text/batch/v1.0-preview.1/batches/87c4a0e9-b2f5-4e78-a0d3-3b1e2c4f5a6b/documents"}
When to use this skill
- When the requirement is to translate entire documents, rather than just isolated text snippets.
- When preserving the original formatting, layout, and structure (e.g., tables, images, charts in Word, PDF, Excel, PowerPoint) is critical.
- When dealing with a substantial batch of documents that all require translation into one or more target languages.
- When integrating robust and scalable translation capabilities directly into an AI agent's automated workflow.
When not to use this skill
- When only small text snippets or phrases need translation (a simpler text translation API would be more efficient).
- When documents are plain text files and format preservation is not a concern, as a less resource-intensive text translation service might suffice.
- When real-time, interactive translation of spoken language is required (consider speech-to-text and real-time text translation skills instead).
- If strict data residency requirements or compliance policies preclude the use of public cloud-based Azure services for data processing.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in .claude/skills/azure-ai-translation-document-py/SKILL.md inside your project
- Restart your AI agent; it will auto-discover the skill
How azure-ai-translation-document-py Compares
| Feature / Agent | azure-ai-translation-document-py | Standard Approach |
|---|---|---|
| Platform Support | Claude | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | easy | N/A |
Frequently Asked Questions
What does this skill do?
Azure AI Document Translation SDK for batch translation of documents with format preservation. Use for translating Word, PDF, Excel, PowerPoint, and other document formats at scale.
Which AI agents support this skill?
This skill is designed for Claude.
How difficult is it to install?
The installation complexity is rated as easy. You can find the installation instructions above.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# Azure AI Document Translation SDK for Python
Client library for Azure AI Translator document translation service for batch document translation with format preservation.
## Installation
```bash
pip install azure-ai-translation-document
```
## Environment Variables
```bash
AZURE_DOCUMENT_TRANSLATION_ENDPOINT=https://<resource>.cognitiveservices.azure.com
AZURE_DOCUMENT_TRANSLATION_KEY=<your-api-key> # If using API key
# Storage for source and target documents
AZURE_SOURCE_CONTAINER_URL=https://<storage>.blob.core.windows.net/<container>?<sas>
AZURE_TARGET_CONTAINER_URL=https://<storage>.blob.core.windows.net/<container>?<sas>
```
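Before constructing a client, it can help to fail fast when a variable is missing or a container URL has no SAS query string. The following is a minimal sketch using only the standard library. The helper name `load_translation_settings` is hypothetical (it is not part of the SDK), and it assumes that the API key is optional (Entra ID authentication does not need it) and that both container URLs carry a SAS token as shown above.

```python
import os
from urllib.parse import urlparse

REQUIRED_VARS = (
    "AZURE_DOCUMENT_TRANSLATION_ENDPOINT",
    "AZURE_SOURCE_CONTAINER_URL",
    "AZURE_TARGET_CONTAINER_URL",
)


def load_translation_settings(env=None):
    """Return the settings as a dict, raising ValueError on obvious problems.

    Hypothetical helper for illustration; not part of azure-ai-translation-document.
    """
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise ValueError(f"Missing environment variables: {', '.join(missing)}")

    settings = {name: env[name] for name in REQUIRED_VARS}
    # Assumption: the key is optional because Entra ID auth does not use it.
    settings["AZURE_DOCUMENT_TRANSLATION_KEY"] = env.get("AZURE_DOCUMENT_TRANSLATION_KEY")

    # Assumption: container URLs are https and carry a SAS token in the query.
    for name in ("AZURE_SOURCE_CONTAINER_URL", "AZURE_TARGET_CONTAINER_URL"):
        parsed = urlparse(settings[name])
        if parsed.scheme != "https" or not parsed.query:
            raise ValueError(f"{name} should be an https URL with a SAS query string")
    return settings
```

Calling `load_translation_settings()` with no argument reads from `os.environ`; passing a plain dict makes the check easy to exercise in tests.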
## Authentication
### API Key
```python
import os
from azure.ai.translation.document import DocumentTranslationClient
from azure.core.credentials import AzureKeyCredential
endpoint = os.environ["AZURE_DOCUMENT_TRANSLATION_ENDPOINT"]
key = os.environ["AZURE_DOCUMENT_TRANSLATION_KEY"]
client = DocumentTranslationClient(endpoint, AzureKeyCredential(key))
```
### Entra ID (Recommended)
```python
import os
from azure.ai.translation.document import DocumentTranslationClient
from azure.identity import DefaultAzureCredential

# DefaultAzureCredential tries the available Entra ID credential sources in turn
client = DocumentTranslationClient(
    endpoint=os.environ["AZURE_DOCUMENT_TRANSLATION_ENDPOINT"],
    credential=DefaultAzureCredential()
)
```
## Basic Document Translation
```python
from azure.ai.translation.document import DocumentTranslationInput, TranslationTarget

source_url = os.environ["AZURE_SOURCE_CONTAINER_URL"]
target_url = os.environ["AZURE_TARGET_CONTAINER_URL"]

# Start translation job
poller = client.begin_translation(
    inputs=[
        DocumentTranslationInput(
            source_url=source_url,
            targets=[
                TranslationTarget(
                    target_url=target_url,
                    language="es"  # Translate to Spanish
                )
            ]
        )
    ]
)

# Wait for completion
result = poller.result()
print(f"Status: {poller.status()}")
print(f"Documents translated: {poller.details.documents_succeeded_count}")
print(f"Documents failed: {poller.details.documents_failed_count}")
```
## Multiple Target Languages
```python
# target_url_es, target_url_fr, and target_url_de are SAS URLs of three
# separate target containers, one per language (defined elsewhere)
poller = client.begin_translation(
    inputs=[
        DocumentTranslationInput(
            source_url=source_url,
            targets=[
                TranslationTarget(target_url=target_url_es, language="es"),
                TranslationTarget(target_url=target_url_fr, language="fr"),
                TranslationTarget(target_url=target_url_de, language="de")
            ]
        )
    ]
)
```
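The snippet above leaves `target_url_es`, `target_url_fr`, and `target_url_de` undefined. One way to derive them is sketched below, assuming a hypothetical naming convention in which each language has its own container named `<base>-<lang>` and a single SAS token (for example an account-level SAS) is valid for all of them. The function name `per_language_target_urls` and the convention itself are illustrative assumptions, and the containers are assumed to exist already.

```python
from urllib.parse import urlsplit, urlunsplit


def per_language_target_urls(base_container_url, languages):
    """Map each language code to '<container>-<lang>', keeping the SAS query.

    Hypothetical convention for illustration: the SAS token in the base URL
    must be valid for every derived container.
    """
    parts = urlsplit(base_container_url)
    container_path = parts.path.rstrip("/")
    return {
        lang: urlunsplit(parts._replace(path=f"{container_path}-{lang}"))
        for lang in languages
    }


targets = per_language_target_urls(
    "https://acct.blob.core.windows.net/translated?sv=1&sig=abc",
    ["es", "fr", "de"],
)
```

Each value in `targets` could then be passed as the `target_url` of the matching `TranslationTarget`.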
## Translate Single Document
```python
# SingleDocumentTranslationClient is not available in older package versions;
# check the translate() signature for the version you have installed
from azure.ai.translation.document import SingleDocumentTranslationClient

single_client = SingleDocumentTranslationClient(endpoint, AzureKeyCredential(key))

with open("document.docx", "rb") as f:
    document_content = f.read()

result = single_client.translate(
    body=document_content,
    target_language="es",
    content_type="application/vnd.openxmlformats-officedocument.wordprocessingml.document"
)

# Save translated document
with open("document_es.docx", "wb") as f:
    f.write(result)
```
## Check Translation Status
```python
# Get all translation operations
operations = client.list_translation_statuses()

for op in operations:
    print(f"Operation ID: {op.id}")
    print(f"Status: {op.status}")
    print(f"Created: {op.created_on}")
    print(f"Total documents: {op.documents_total_count}")
    print(f"Succeeded: {op.documents_succeeded_count}")
    print(f"Failed: {op.documents_failed_count}")
```
## List Document Statuses
```python
# Get status of individual documents in a job
operation_id = poller.id
document_statuses = client.list_document_statuses(operation_id)

for doc in document_statuses:
    print(f"Document: {doc.source_document_url}")
    print(f"  Status: {doc.status}")
    print(f"  Translated to: {doc.translated_to}")
    if doc.error:
        print(f"  Error: {doc.error.message}")
```
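For large jobs, printing every document is noisy. The loop above can be condensed into a per-status count plus a list of failures. This is a minimal sketch: `summarize_documents` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins that only mimic the `status`, `source_document_url`, and `error.message` attributes used above. Real code would pass the result of `client.list_document_statuses(operation_id)` instead.

```python
from collections import Counter
from types import SimpleNamespace


def summarize_documents(document_statuses):
    """Count documents per status and collect (url, message) for failures."""
    counts = Counter()
    failures = []
    for doc in document_statuses:
        counts[doc.status] += 1
        if doc.error:
            failures.append((doc.source_document_url, doc.error.message))
    return counts, failures


# Stand-in data for illustration only; the URLs and message are made up.
sample = [
    SimpleNamespace(
        status="Succeeded",
        source_document_url="https://example.blob.core.windows.net/src/a.docx",
        error=None,
    ),
    SimpleNamespace(
        status="Failed",
        source_document_url="https://example.blob.core.windows.net/src/b.pdf",
        error=SimpleNamespace(message="Unsupported format"),
    ),
]

counts, failures = summarize_documents(sample)
for url, message in failures:
    print(f"{url}: {message}")
```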
## Cancel Translation
```python
# Cancel a running translation
client.cancel_translation(operation_id)
```
## Using Glossary
```python
from azure.ai.translation.document import TranslationGlossary

poller = client.begin_translation(
    inputs=[
        DocumentTranslationInput(
            source_url=source_url,
            targets=[
                TranslationTarget(
                    target_url=target_url,
                    language="es",
                    glossaries=[
                        TranslationGlossary(
                            glossary_url="https://<storage>.blob.core.windows.net/glossary/terms.csv?<sas>",
                            file_format="csv"
                        )
                    ]
                )
            ]
        )
    ]
)
```
## Supported Document Formats
```python
# Get supported formats
formats = client.get_supported_document_formats()

for fmt in formats:
    # The format name is exposed as file_format
    print(f"Format: {fmt.file_format}")
    print(f"  Extensions: {fmt.file_extensions}")
    print(f"  Content types: {fmt.content_types}")
```
## Supported Languages
```python
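# Get supported languages
# Assumption: the installed SDK version exposes get_supported_languages() on
# this client; if it does not, consult the service's supported-languages list
languages = client.get_supported_languages()

for lang in languages:
    print(f"Language: {lang.name} ({lang.code})")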
```
## Async Client
```python
from azure.ai.translation.document.aio import DocumentTranslationClient
from azure.identity.aio import DefaultAzureCredential

async def translate_documents():
    async with DocumentTranslationClient(
        endpoint=endpoint,
        credential=DefaultAzureCredential()
    ) as client:
        poller = await client.begin_translation(inputs=[...])
        result = await poller.result()
```
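When several jobs run at once, it is usually worth capping how many are in flight. The sketch below shows one way to do that with `asyncio.Semaphore` and `asyncio.gather`. It does not call the SDK: `fake_submit` is a hypothetical stand-in for a coroutine that would call `begin_translation` and await the poller, and `run_jobs` and `max_concurrent` are illustrative names.

```python
import asyncio


async def run_jobs(submit_job, job_inputs, max_concurrent=3):
    """Await submit_job(item) for every item, at most max_concurrent at a time."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def guarded(item):
        async with semaphore:
            return await submit_job(item)

    # return_exceptions=True returns failures alongside successes, in input
    # order, instead of raising the first exception.
    return await asyncio.gather(
        *(guarded(item) for item in job_inputs), return_exceptions=True
    )


async def fake_submit(language):
    # Stand-in for: poller = await client.begin_translation(...)
    #               await poller.result()
    await asyncio.sleep(0)
    return f"done:{language}"


results = asyncio.run(run_jobs(fake_submit, ["es", "fr", "de"]))
```

Because exceptions are returned rather than raised, the caller should check each entry of `results` with `isinstance(entry, Exception)` before using it.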
## Supported Formats
| Category | Formats |
|----------|---------|
| Documents | DOCX, PDF, PPTX, XLSX, HTML, TXT, RTF |
| Structured | CSV, TSV, JSON, XML |
| Localization | XLIFF, XLF, MHTML |
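A quick local pre-check can separate files with a listed extension from the rest before a job is submitted. The sketch below hard-codes extensions derived from the table above, which is an assumption for illustration. The list returned by `get_supported_document_formats()` is the authoritative source, and `split_by_support` is a hypothetical helper name.

```python
from pathlib import PurePosixPath

# Extensions derived from the table above; the service's own list is authoritative.
TABLE_EXTENSIONS = frozenset({
    ".docx", ".pdf", ".pptx", ".xlsx", ".html", ".txt", ".rtf",
    ".csv", ".tsv", ".json", ".xml",
    ".xliff", ".xlf", ".mhtml",
})


def split_by_support(blob_names, supported=TABLE_EXTENSIONS):
    """Return (accepted, rejected) lists based on the file extension."""
    accepted, rejected = [], []
    for name in blob_names:
        suffix = PurePosixPath(name).suffix.lower()
        (accepted if suffix in supported else rejected).append(name)
    return accepted, rejected


accepted, rejected = split_by_support(
    ["reports/q1.DOCX", "notes.md", "data/table.csv", "archive.zip"]
)
```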
## Storage Requirements
- Source and target containers must be Azure Blob Storage
- Use SAS tokens with appropriate permissions:
  - Source: Read, List
  - Target: Write, List
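The permissions listed above can be checked locally before a job is submitted. In a SAS token, the `sp` query parameter holds the granted permissions as single letters (`r` for read, `w` for write, `l` for list). The sketch below inspects only that parameter; it does not validate the signature or expiry, and `missing_sas_permissions` is a hypothetical helper name.

```python
from urllib.parse import parse_qs, urlparse

# Required permission letters per role, following the list above.
REQUIRED_PERMISSIONS = {
    "source": {"r", "l"},  # Read, List
    "target": {"w", "l"},  # Write, List
}


def missing_sas_permissions(container_url, role):
    """Return the set of required permission letters absent from the 'sp' field."""
    query = parse_qs(urlparse(container_url).query)
    granted = set(query.get("sp", [""])[0])
    return REQUIRED_PERMISSIONS[role] - granted


missing = missing_sas_permissions(
    "https://acct.blob.core.windows.net/src?sp=r&sig=abc", "source"
)
if missing:
    print(f"Source SAS is missing permissions: {sorted(missing)}")
```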
## Best Practices
1. **Use SAS tokens** with minimal required permissions
2. **Monitor long-running operations** with `poller.status()`
3. **Handle document-level errors** by iterating document statuses
4. **Use glossaries** for domain-specific terminology
5. **Separate target containers** for each language
6. **Use async client** for multiple concurrent jobs
7. **Check supported formats** before submitting documents
## When to Use
This skill is applicable to execute the workflow or actions described in the overview.