azure-ai-document-intelligence-dotnet
Azure AI Document Intelligence SDK for .NET. Extract text, tables, and structured data from documents using prebuilt and custom models. Use for invoice processing, receipt extraction, ID document analysis, and custom document models. Triggers: "Document Intelligence", "DocumentIntelligenceClient", "form recognizer", "invoice extraction", "receipt OCR", "document analysis .NET".
Best use case
azure-ai-document-intelligence-dotnet is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Azure AI Document Intelligence SDK for .NET. Extract text, tables, and structured data from documents using prebuilt and custom models. Use for invoice processing, receipt extraction, ID document analysis, and custom document models. Triggers: "Document Intelligence", "DocumentIntelligenceClient", "form recognizer", "invoice extraction", "receipt OCR", "document analysis .NET".
Teams using azure-ai-document-intelligence-dotnet should expect a more consistent output, faster repeated execution, less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in
.claude/skills/azure-ai-document-intelligence-dotnet/SKILL.mdinside your project - Restart your AI agent — it will auto-discover the skill
How azure-ai-document-intelligence-dotnet Compares
| Feature / Agent | azure-ai-document-intelligence-dotnet | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Azure AI Document Intelligence SDK for .NET. Extract text, tables, and structured data from documents using prebuilt and custom models. Use for invoice processing, receipt extraction, ID document analysis, and custom document models. Triggers: "Document Intelligence", "DocumentIntelligenceClient", "form recognizer", "invoice extraction", "receipt OCR", "document analysis .NET".
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# Azure.AI.DocumentIntelligence (.NET)
Extract text, tables, and structured data from documents using prebuilt and custom models.
## Installation
```bash
dotnet add package Azure.AI.DocumentIntelligence
dotnet add package Azure.Identity
```
**Current Version**: v1.0.0 (GA)
## Environment Variables
```bash
DOCUMENT_INTELLIGENCE_ENDPOINT=https://<resource-name>.cognitiveservices.azure.com/
DOCUMENT_INTELLIGENCE_API_KEY=<your-api-key>
BLOB_CONTAINER_SAS_URL=https://<storage>.blob.core.windows.net/<container>?<sas-token>
```
## Authentication
### Microsoft Entra ID (Recommended)
```csharp
using Azure.Identity;
using Azure.AI.DocumentIntelligence;
string endpoint = Environment.GetEnvironmentVariable("DOCUMENT_INTELLIGENCE_ENDPOINT");
var credential = new DefaultAzureCredential();
var client = new DocumentIntelligenceClient(new Uri(endpoint), credential);
```
> **Note**: Entra ID requires a **custom subdomain** (e.g., `https://<resource-name>.cognitiveservices.azure.com/`), not a regional endpoint.
### API Key
```csharp
string endpoint = Environment.GetEnvironmentVariable("DOCUMENT_INTELLIGENCE_ENDPOINT");
string apiKey = Environment.GetEnvironmentVariable("DOCUMENT_INTELLIGENCE_API_KEY");
var client = new DocumentIntelligenceClient(new Uri(endpoint), new AzureKeyCredential(apiKey));
```
## Client Types
| Client | Purpose |
|--------|---------|
| `DocumentIntelligenceClient` | Analyze documents, classify documents |
| `DocumentIntelligenceAdministrationClient` | Build/manage custom models and classifiers |
## Prebuilt Models
| Model ID | Description |
|----------|-------------|
| `prebuilt-read` | Extract text, languages, handwriting |
| `prebuilt-layout` | Extract text, tables, selection marks, structure |
| `prebuilt-invoice` | Extract invoice fields (vendor, items, totals) |
| `prebuilt-receipt` | Extract receipt fields (merchant, items, total) |
| `prebuilt-idDocument` | Extract ID document fields (name, DOB, address) |
| `prebuilt-businessCard` | Extract business card fields |
| `prebuilt-tax.us.w2` | Extract W-2 tax form fields |
| `prebuilt-healthInsuranceCard.us` | Extract health insurance card fields |
## Core Workflows
### 1. Analyze Invoice
```csharp
using Azure.AI.DocumentIntelligence;
Uri invoiceUri = new Uri("https://example.com/invoice.pdf");
Operation<AnalyzeResult> operation = await client.AnalyzeDocumentAsync(
WaitUntil.Completed,
"prebuilt-invoice",
invoiceUri);
AnalyzeResult result = operation.Value;
foreach (AnalyzedDocument document in result.Documents)
{
if (document.Fields.TryGetValue("VendorName", out DocumentField vendorNameField)
&& vendorNameField.FieldType == DocumentFieldType.String)
{
string vendorName = vendorNameField.ValueString;
Console.WriteLine($"Vendor Name: '{vendorName}', confidence: {vendorNameField.Confidence}");
}
if (document.Fields.TryGetValue("InvoiceTotal", out DocumentField invoiceTotalField)
&& invoiceTotalField.FieldType == DocumentFieldType.Currency)
{
CurrencyValue invoiceTotal = invoiceTotalField.ValueCurrency;
Console.WriteLine($"Invoice Total: '{invoiceTotal.CurrencySymbol}{invoiceTotal.Amount}'");
}
// Extract line items
if (document.Fields.TryGetValue("Items", out DocumentField itemsField)
&& itemsField.FieldType == DocumentFieldType.List)
{
foreach (DocumentField item in itemsField.ValueList)
{
var itemFields = item.ValueDictionary;
if (itemFields.TryGetValue("Description", out DocumentField descField))
Console.WriteLine($" Item: {descField.ValueString}");
}
}
}
```
### 2. Extract Layout (Text, Tables, Structure)
```csharp
Uri fileUri = new Uri("https://example.com/document.pdf");
Operation<AnalyzeResult> operation = await client.AnalyzeDocumentAsync(
WaitUntil.Completed,
"prebuilt-layout",
fileUri);
AnalyzeResult result = operation.Value;
// Extract text by page
foreach (DocumentPage page in result.Pages)
{
Console.WriteLine($"Page {page.PageNumber}: {page.Lines.Count} lines, {page.Words.Count} words");
foreach (DocumentLine line in page.Lines)
{
Console.WriteLine($" Line: '{line.Content}'");
}
}
// Extract tables
foreach (DocumentTable table in result.Tables)
{
Console.WriteLine($"Table: {table.RowCount} rows x {table.ColumnCount} columns");
foreach (DocumentTableCell cell in table.Cells)
{
Console.WriteLine($" Cell ({cell.RowIndex}, {cell.ColumnIndex}): {cell.Content}");
}
}
```
### 3. Analyze Receipt
```csharp
Operation<AnalyzeResult> operation = await client.AnalyzeDocumentAsync(
WaitUntil.Completed,
"prebuilt-receipt",
receiptUri);
AnalyzeResult result = operation.Value;
foreach (AnalyzedDocument document in result.Documents)
{
if (document.Fields.TryGetValue("MerchantName", out DocumentField merchantField))
Console.WriteLine($"Merchant: {merchantField.ValueString}");
if (document.Fields.TryGetValue("Total", out DocumentField totalField))
Console.WriteLine($"Total: {totalField.ValueCurrency.Amount}");
if (document.Fields.TryGetValue("TransactionDate", out DocumentField dateField))
Console.WriteLine($"Date: {dateField.ValueDate}");
}
```
### 4. Build Custom Model
```csharp
var adminClient = new DocumentIntelligenceAdministrationClient(
new Uri(endpoint),
new AzureKeyCredential(apiKey));
string modelId = "my-custom-model";
Uri blobContainerUri = new Uri("<blob-container-sas-url>");
var blobSource = new BlobContentSource(blobContainerUri);
var options = new BuildDocumentModelOptions(modelId, DocumentBuildMode.Template, blobSource);
Operation<DocumentModelDetails> operation = await adminClient.BuildDocumentModelAsync(
WaitUntil.Completed,
options);
DocumentModelDetails model = operation.Value;
Console.WriteLine($"Model ID: {model.ModelId}");
Console.WriteLine($"Created: {model.CreatedOn}");
foreach (var docType in model.DocumentTypes)
{
Console.WriteLine($"Document type: {docType.Key}");
foreach (var field in docType.Value.FieldSchema)
{
Console.WriteLine($" Field: {field.Key}, Confidence: {docType.Value.FieldConfidence[field.Key]}");
}
}
```
### 5. Build Document Classifier
```csharp
string classifierId = "my-classifier";
Uri blobContainerUri = new Uri("<blob-container-sas-url>");
var sourceA = new BlobContentSource(blobContainerUri) { Prefix = "TypeA/train" };
var sourceB = new BlobContentSource(blobContainerUri) { Prefix = "TypeB/train" };
var docTypes = new Dictionary<string, ClassifierDocumentTypeDetails>()
{
{ "TypeA", new ClassifierDocumentTypeDetails(sourceA) },
{ "TypeB", new ClassifierDocumentTypeDetails(sourceB) }
};
var options = new BuildClassifierOptions(classifierId, docTypes);
Operation<DocumentClassifierDetails> operation = await adminClient.BuildClassifierAsync(
WaitUntil.Completed,
options);
DocumentClassifierDetails classifier = operation.Value;
Console.WriteLine($"Classifier ID: {classifier.ClassifierId}");
```
### 6. Classify Document
```csharp
string classifierId = "my-classifier";
Uri documentUri = new Uri("https://example.com/document.pdf");
var options = new ClassifyDocumentOptions(classifierId, documentUri);
Operation<AnalyzeResult> operation = await client.ClassifyDocumentAsync(
WaitUntil.Completed,
options);
AnalyzeResult result = operation.Value;
foreach (AnalyzedDocument document in result.Documents)
{
Console.WriteLine($"Document type: {document.DocumentType}, confidence: {document.Confidence}");
}
```
### 7. Manage Models
```csharp
// Get resource details
DocumentIntelligenceResourceDetails resourceDetails = await adminClient.GetResourceDetailsAsync();
Console.WriteLine($"Custom models: {resourceDetails.CustomDocumentModels.Count}/{resourceDetails.CustomDocumentModels.Limit}");
// Get specific model
DocumentModelDetails model = await adminClient.GetModelAsync("my-model-id");
Console.WriteLine($"Model: {model.ModelId}, Created: {model.CreatedOn}");
// List models
await foreach (DocumentModelDetails modelItem in adminClient.GetModelsAsync())
{
Console.WriteLine($"Model: {modelItem.ModelId}");
}
// Delete model
await adminClient.DeleteModelAsync("my-model-id");
```
## Key Types Reference
| Type | Description |
|------|-------------|
| `DocumentIntelligenceClient` | Main client for analysis |
| `DocumentIntelligenceAdministrationClient` | Model management |
| `AnalyzeResult` | Result of document analysis |
| `AnalyzedDocument` | Single document within result |
| `DocumentField` | Extracted field with value and confidence |
| `DocumentFieldType` | String, Date, Number, Currency, etc. |
| `DocumentPage` | Page info (lines, words, selection marks) |
| `DocumentTable` | Extracted table with cells |
| `DocumentModelDetails` | Custom model metadata |
| `BlobContentSource` | Training data source |
## Build Modes
| Mode | Use Case |
|------|----------|
| `DocumentBuildMode.Template` | Fixed layout documents (forms) |
| `DocumentBuildMode.Neural` | Variable layout documents |
## Best Practices
1. **Use DefaultAzureCredential** for production
2. **Reuse client instances** — clients are thread-safe
3. **Handle long-running operations** — Use `WaitUntil.Completed` for simplicity
4. **Check field confidence** — Always verify `Confidence` property
5. **Use appropriate model** — Prebuilt for common docs, custom for specialized
6. **Use custom subdomain** — Required for Entra ID authentication
## Error Handling
```csharp
using Azure;
try
{
var operation = await client.AnalyzeDocumentAsync(
WaitUntil.Completed,
"prebuilt-invoice",
documentUri);
}
catch (RequestFailedException ex)
{
Console.WriteLine($"Error: {ex.Status} - {ex.Message}");
}
```
## Related SDKs
| SDK | Purpose | Install |
|-----|---------|---------|
| `Azure.AI.DocumentIntelligence` | Document analysis (this SDK) | `dotnet add package Azure.AI.DocumentIntelligence` |
| `Azure.AI.FormRecognizer` | Legacy SDK (deprecated) | Use DocumentIntelligence instead |
## Reference Links
| Resource | URL |
|----------|-----|
| NuGet Package | https://www.nuget.org/packages/Azure.AI.DocumentIntelligence |
| API Reference | https://learn.microsoft.com/dotnet/api/azure.ai.documentintelligence |
| GitHub Samples | https://github.com/Azure/azure-sdk-for-net/tree/main/sdk/documentintelligence/Azure.AI.DocumentIntelligence/samples |
| Document Intelligence Studio | https://documentintelligence.ai.azure.com/ |
| Prebuilt Models | https://aka.ms/azsdk/formrecognizer/models |Related Skills
sdk-documentation-generator
Sdk Documentation Generator - Auto-activating skill for Technical Documentation. Triggers on: sdk documentation generator, sdk documentation generator Part of the Technical Documentation skill category.
document-merger
Document Merger - Auto-activating skill for Business Automation. Triggers on: document merger, document merger Part of the Business Automation skill category.
database-documentation-gen
Process use when you need to work with database documentation. This skill provides automated documentation generation with comprehensive guidance and automation. Trigger with phrases like "generate docs", "document schema", or "create database documentation".
code-documentation-analyzer
Code Documentation Analyzer - Auto-activating skill for Technical Documentation. Triggers on: code documentation analyzer, code documentation analyzer Part of the Technical Documentation skill category.
azure-ml-deployer
Azure Ml Deployer - Auto-activating skill for ML Deployment. Triggers on: azure ml deployer, azure ml deployer Part of the ML Deployment skill category.
document-writer
多风格文档写作技能。支持乔木、小红书、Dankoe、微信公众号、Twitter等5种写作风格。Claude 根据内容智能选择风格,按规范撰写文章。
azure-verified-modules
Azure Verified Modules (AVM) requirements and best practices for developing certified Azure Terraform modules. Use when creating or reviewing Azure modules that need AVM certification.
azure-image-builder
Build Azure managed images and Azure Compute Gallery images with Packer. Use when creating custom images for Azure VMs.
latex-document
Universal LaTeX document skill: create, compile, and convert any document to professional PDF with PNG previews. Supports resumes, reports, cover letters, invoices, academic papers, theses/dissertations, academic CVs, presentations (Beamer), scientific posters, formal letters, exams/quizzes, books, cheat sheets, reference cards, exam formula sheets, fillable PDF forms (hyperref form fields), conditional content (etoolbox toggles), mail merge from CSV/JSON (Jinja2 templates), version diffing (latexdiff), charts (pgfplots + matplotlib), tables (booktabs + CSV import), images (TikZ), Mermaid diagrams, AI-generated images, watermarks, landscape pages, bibliography/citations (BibTeX/biblatex), multi-language/CJK (auto XeLaTeX), algorithms/pseudocode, colored boxes (tcolorbox), SI units (siunitx), Pandoc format conversion (Markdown/DOCX/HTML ↔ LaTeX), and PDF-to-LaTeX conversion of handwritten or printed documents (math, business, legal, general). Compile script supports pdflatex, xelatex, lualatex with auto-detection, latexmk backend, texfot log filtering, PDF/A output, and verbosity control (--verbose/--quiet). Empirically optimized scaling: single agent 1-10 pages, split 11-20, batch-7 pipeline 21+. Use when user asks to: (1) create a resume/CV/cover letter, (2) write a LaTeX document, (3) create PDF with tables/charts/images, (4) compile a .tex file, (5) make a report/invoice/presentation, (6) anything involving LaTeX or pdflatex, (7) convert/OCR a PDF to LaTeX, (8) convert handwritten notes, (9) create charts/graphs/diagrams, (10) create slides, (11) write a thesis or dissertation, (12) create an academic CV, (13) create a poster, (14) create an exam/quiz, (15) create a book, (16) convert between document formats (Markdown, DOCX, HTML to/from LaTeX), (17) generate Mermaid diagrams for LaTeX, (18) create a formal business letter, (19) create a cheat sheet or reference card, (20) create an exam formula sheet or crib sheet, (21) condense lecture notes/PDFs into a cheat sheet, (22) create a fillable PDF form with text fields/checkboxes/dropdowns, (23) create a document with conditional content/toggles (show/hide sections), (24) generate batch/mail-merge documents from CSV/JSON data, (25) create a version diff PDF (latexdiff) highlighting changes between documents, (26) create a homework or assignment submission with problems and solutions, (27) create a lab report with data tables, graphs, and error analysis, (28) encrypt or password-protect a PDF, (29) merge multiple PDFs into one, (30) optimize/compress a PDF for web or email, (31) lint or check a LaTeX document for common issues, (32) count words in a LaTeX document, (33) analyze document statistics (figures, tables, citations), (34) fetch BibTeX from a DOI, (35) convert a Graphviz .dot file to PDF/PNG, (36) convert a PlantUML .puml file to PDF/PNG, (37) create a one-pager/fact sheet/executive summary, (38) create a datasheet or product specification sheet, (39) extract pages from a PDF (page ranges, odd/even), (40) check LaTeX package availability before compiling, (41) analyze citations and cross-reference with .bib files, (42) debug LaTeX compilation errors, (43) make a document accessible (PDF/A, tagged PDF), (44) create lecture notes or course handouts, (45) fill an existing PDF form (fillable fields or non-fillable with annotations), (46) extract text or tables from a PDF (pdfplumber, pypdf), (47) OCR a scanned PDF to text (pytesseract), (48) create a PDF programmatically with reportlab (Canvas, Platypus), (49) rotate or crop PDF pages (pypdf), (50) add a watermark to an existing PDF, (51) extract metadata from a PDF (title, author, subject).
terraform-azurerm-set-diff-analyzer
Analyze Terraform plan JSON output for AzureRM Provider to distinguish between false-positive diffs (order-only changes in Set-type attributes) and actual resource changes. Use when reviewing terraform plan output for Azure resources like Application Gateway, Load Balancer, Firewall, Front Door, NSG, and other resources with Set-type attributes that cause spurious diffs due to internal ordering changes.
oo-component-documentation
Create or update standardized object-oriented component documentation using a shared template plus mode-specific guidance for new and existing docs.
dotnet-upgrade
Ready-to-use prompts for comprehensive .NET framework upgrade analysis and execution