azure-search-documents-py

Azure AI Search SDK for Python. Use for vector search, hybrid search, semantic ranking, indexing, and skillsets.

31,392 stars

Complexity: easy

About this skill

This skill provides Python SDK access to Azure AI Search, a comprehensive cloud search service. It enables AI agents to perform sophisticated search operations, including vector search for semantic similarity, hybrid search combining keyword and vector methods, and semantic ranking for highly relevant results. Beyond search, it supports indexing documents and managing skillsets for AI enrichment pipelines directly from Python. This allows agents to process and retrieve information from large datasets with advanced capabilities like natural language understanding, OCR, and image analysis, enhancing their ability to interact with and analyze external data sources. This skill is part of the 'antigravity-awesome-skills' collection, known for enhancing AI agent capabilities with specialized tools.

Best use case

Enabling AI agents to perform advanced information retrieval from large knowledge bases, building Retrieval Augmented Generation (RAG) systems, implementing intelligent document processing workflows with AI enrichment, and managing enterprise-grade search solutions.

Accurate and highly relevant search results based on complex queries (vector, hybrid, semantic), successful indexing of documents and data into an Azure AI Search index, efficient management and execution of AI enrichment skillsets on indexed content, and enhanced agent capabilities in information retrieval and knowledge base interaction.

Practical example

Example input

```json
{
  "command": "azure-search-documents-py.search_documents",
  "args": {
    "query": "explain quantum entanglement with simple examples",
    "index_name": "science-articles",
    "query_type": "semantic",
    "top_k": 3,
    "select_fields": ["title", "summary", "url"]
  }
}
```

Example output

```json
{
  "status": "success",
  "results": [
    {
      "id": "article-123",
      "title": "Quantum Entanglement Explained Simply",
      "summary": "Quantum entanglement is a bizarre phenomenon where two or more particles become linked and share the same fate, no matter how far apart they are. Imagine two coins that, when flipped, always show opposite sides, even if you flip them light-years apart.",
      "score": 0.98,
      "url": "https://example.com/quantum-entanglement"
    },
    {
      "id": "paper-456",
      "title": "The Spooky Action at a Distance: Entanglement's Implications",
      "summary": "Einstein famously referred to entanglement as 'spooky action at a distance' because it seemed to violate the principle that nothing can travel faster than light. Despite initial skepticism, entanglement is a proven fact and is key to quantum computing.",
      "score": 0.95,
      "url": "https://example.com/spooky-action"
    }
  ],
  "message": "Search completed successfully. Found 2 relevant documents."
}
```
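The page does not show how the example `args` reach the SDK. A minimal sketch of one possible translation into keyword arguments for `SearchClient.search` follows. The function name `to_search_kwargs` and the default semantic configuration name are hypothetical; `index_name` is not translated because it belongs on the `SearchClient` constructor rather than on the query.

```python
def to_search_kwargs(args, semantic_config="my-semantic-config"):
    """Translate the example `args` payload into SearchClient.search keyword arguments.

    Hypothetical helper: the mapping of `top_k` -> `top` and
    `select_fields` -> `select` is an assumption, not part of the skill.
    """
    kwargs = {
        "search_text": args["query"],
        "top": args.get("top_k", 10),
    }
    if args.get("select_fields"):
        kwargs["select"] = list(args["select_fields"])
    if args.get("query_type") == "semantic":
        # Semantic ranking also needs the name of a semantic configuration
        kwargs["query_type"] = "semantic"
        kwargs["semantic_configuration_name"] = semantic_config
    return kwargs


example_args = {
    "query": "explain quantum entanglement with simple examples",
    "index_name": "science-articles",
    "query_type": "semantic",
    "top_k": 3,
    "select_fields": ["title", "summary", "url"],
}
search_kwargs = to_search_kwargs(example_args)
print(search_kwargs)
```

The resulting dictionary could then be passed as `client.search(**search_kwargs)` on a client created for the `science-articles` index.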

When to use this skill

When your AI agent needs to query or manage data within an Azure AI Search instance, when performing complex searches that require vector similarity, hybrid approaches, or semantic understanding, when documents need AI-powered enrichment before being made searchable, or when the agent needs to access a secure, scalable search solution for enterprise data.

When not to use this skill

For simple keyword searches where a basic database query or file system search would suffice, when the data is not hosted on Azure AI Search or a similar compatible service, when the agent's task does not involve information retrieval or document indexing, or for tasks requiring only local data processing without external search capabilities.

Installation

Claude Code / Cursor / Codex

curl -o ~/.claude/skills/azure-search-documents-py/SKILL.md --create-dirs "https://raw.githubusercontent.com/sickn33/antigravity-awesome-skills/main/plugins/antigravity-awesome-skills-claude/skills/azure-search-documents-py/SKILL.md"

Manual Installation

1. Download SKILL.md from GitHub.
2. Place it in .claude/skills/azure-search-documents-py/SKILL.md inside your project.
3. Restart your AI agent; it will auto-discover the skill.

How azure-search-documents-py Compares

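| Feature / Agent | azure-search-documents-py | Standard Approach |
|-----------------|---------------------------|-------------------|
| Platform Support | Claude | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | easy | N/A |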

Frequently Asked Questions

What does this skill do?

Azure AI Search SDK for Python. Use for vector search, hybrid search, semantic ranking, indexing, and skillsets.

Which AI agents support this skill?

This skill is designed for Claude.

How difficult is it to install?

The installation complexity is rated as easy. You can find the installation instructions above.

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# Azure AI Search SDK for Python

Full-text, vector, and hybrid search with AI enrichment capabilities.

## Installation

```bash
pip install azure-search-documents
```

## Environment Variables

```bash
AZURE_SEARCH_ENDPOINT=https://<service-name>.search.windows.net
AZURE_SEARCH_API_KEY=<your-api-key>
AZURE_SEARCH_INDEX_NAME=<your-index-name>
```
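A minimal sketch of reading these variables with an early, explicit failure when one is missing. The helper name `load_search_config` is hypothetical, and treating `AZURE_SEARCH_API_KEY` as optional is an assumption based on the Entra ID option described in the next section.

```python
import os

# Variables every client needs; the API key is only needed for key-based auth
REQUIRED = ("AZURE_SEARCH_ENDPOINT", "AZURE_SEARCH_INDEX_NAME")


def load_search_config(env=None):
    """Return endpoint, index name, and optional API key from a mapping of variables."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {
        "endpoint": env["AZURE_SEARCH_ENDPOINT"].rstrip("/"),
        "index_name": env["AZURE_SEARCH_INDEX_NAME"],
        "api_key": env.get("AZURE_SEARCH_API_KEY") or None,
    }


# Demonstration with an explicit mapping instead of the real environment
demo_config = load_search_config({
    "AZURE_SEARCH_ENDPOINT": "https://example.search.windows.net/",
    "AZURE_SEARCH_INDEX_NAME": "my-index",
})
print(demo_config)
```

Calling `load_search_config()` with no argument reads from `os.environ`.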

## Authentication

### API Key

```python
import os

from azure.search.documents import SearchClient
from azure.core.credentials import AzureKeyCredential

client = SearchClient(
    endpoint=os.environ["AZURE_SEARCH_ENDPOINT"],
    index_name=os.environ["AZURE_SEARCH_INDEX_NAME"],
    credential=AzureKeyCredential(os.environ["AZURE_SEARCH_API_KEY"])
)
```

### Entra ID (Recommended)

```python
import os

from azure.search.documents import SearchClient
from azure.identity import DefaultAzureCredential

client = SearchClient(
    endpoint=os.environ["AZURE_SEARCH_ENDPOINT"],
    index_name=os.environ["AZURE_SEARCH_INDEX_NAME"],
    credential=DefaultAzureCredential()
)
```

## Client Types

| Client | Purpose |
|--------|---------|
| `SearchClient` | Search and document operations |
| `SearchIndexClient` | Index management, synonym maps |
| `SearchIndexerClient` | Indexers, data sources, skillsets |

## Create Index with Vector Field

```python
from azure.core.credentials import AzureKeyCredential
from azure.search.documents.indexes import SearchIndexClient
from azure.search.documents.indexes.models import (
    SearchIndex,
    SearchField,
    SearchFieldDataType,
    VectorSearch,
    HnswAlgorithmConfiguration,
    VectorSearchProfile,
    SearchableField,
    SimpleField
)

# endpoint and key are assumed to come from the environment variables above
index_client = SearchIndexClient(endpoint, AzureKeyCredential(key))

fields = [
    SimpleField(name="id", type=SearchFieldDataType.String, key=True),
    SearchableField(name="title", type=SearchFieldDataType.String),
    SearchableField(name="content", type=SearchFieldDataType.String),
    SearchField(
        name="content_vector",
        type=SearchFieldDataType.Collection(SearchFieldDataType.Single),
        searchable=True,
        vector_search_dimensions=1536,
        vector_search_profile_name="my-vector-profile"
    )
]

vector_search = VectorSearch(
    algorithms=[
        HnswAlgorithmConfiguration(name="my-hnsw")
    ],
    profiles=[
        VectorSearchProfile(
            name="my-vector-profile",
            algorithm_configuration_name="my-hnsw"
        )
    ]
)

index = SearchIndex(
    name="my-index",
    fields=fields,
    vector_search=vector_search
)

index_client.create_or_update_index(index)
```

## Upload Documents

```python
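from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient

client = SearchClient(endpoint, "my-index", AzureKeyCredential(key))

documents = [
    {
        "id": "1",
        "title": "Azure AI Search",
        "content": "Full-text and vector search service",
        # Placeholder: replace with the full 1536-dimension embedding
        "content_vector": [0.1, 0.2, ...]
    }
]

# upload_documents returns one IndexingResult per document
result = client.upload_documents(documents)
succeeded = sum(1 for r in result if r.succeeded)
print(f"Uploaded {succeeded} of {len(result)} documents")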
```
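Because the vector field above is declared with 1536 dimensions, a local length check before upload can catch mismatched embeddings early. This is a minimal sketch; the helper name `find_bad_vectors` is hypothetical, and the field name and dimension default simply mirror the index definition above.

```python
def find_bad_vectors(documents, field="content_vector", expected=1536):
    """Return the ids of documents whose vector is missing or has the wrong length."""
    bad_ids = []
    for doc in documents:
        vector = doc.get(field)
        if vector is None or len(vector) != expected:
            bad_ids.append(doc.get("id"))
    return bad_ids


sample_docs = [
    {"id": "1", "content_vector": [0.0] * 1536},  # matches the index definition
    {"id": "2", "content_vector": [0.0] * 768},   # wrong embedding model
    {"id": "3"},                                  # vector missing entirely
]
bad = find_bad_vectors(sample_docs)
print(bad)
```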

## Keyword Search

```python
results = client.search(
    search_text="azure search",
    select=["id", "title", "content"],
    top=10
)

for result in results:
    print(f"{result['title']}: {result['@search.score']}")
```

## Vector Search

```python
from azure.search.documents.models import VectorizedQuery

# Your query embedding (1536 dimensions)
query_vector = get_embedding("semantic search capabilities")

vector_query = VectorizedQuery(
    vector=query_vector,
    k_nearest_neighbors=10,
    fields="content_vector"
)

results = client.search(
    vector_queries=[vector_query],
    select=["id", "title", "content"]
)

for result in results:
    print(f"{result['title']}: {result['@search.score']}")
```

## Hybrid Search (Vector + Keyword)

```python
from azure.search.documents.models import VectorizedQuery

vector_query = VectorizedQuery(
    vector=query_vector,
    k_nearest_neighbors=10,
    fields="content_vector"
)

results = client.search(
    search_text="azure search",
    vector_queries=[vector_query],
    select=["id", "title", "content"],
    top=10
)
```
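Microsoft's documentation describes hybrid results as a merge of the keyword and vector rankings using Reciprocal Rank Fusion (RRF). The sketch below illustrates that idea locally; it is not the service's implementation, and the constant `k=60` and the function name are assumptions made for illustration.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one list of (id, score) pairs.

    Each document earns 1 / (k + rank) from every list it appears in,
    where rank starts at 1 for the top result.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id for a stable order
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


keyword_ranking = ["a", "b", "c"]  # order returned by the keyword query
vector_ranking = ["b", "d", "a"]   # order returned by the vector query
fused = reciprocal_rank_fusion([keyword_ranking, vector_ranking])
print([doc_id for doc_id, _ in fused])
```

Documents that appear in both lists ("a" and "b") rise above those that appear in only one, which is the behavior that makes hybrid search useful when neither method alone ranks well.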

## Semantic Ranking

```python
from azure.search.documents.models import QueryType

results = client.search(
    search_text="what is azure search",
    query_type=QueryType.SEMANTIC,
    semantic_configuration_name="my-semantic-config",
    query_caption="extractive",  # request captions so @search.captions is populated
    select=["id", "title", "content"],
    top=10
)

for result in results:
    print(f"{result['title']}")
    if result.get("@search.captions"):
        print(f"  Caption: {result['@search.captions'][0].text}")
```

## Filters

```python
results = client.search(
    search_text="*",
    filter="category eq 'Technology' and rating gt 4",
    order_by=["rating desc"],
    select=["id", "title", "category", "rating"]
)
```
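When filter values come from user input, string literals need escaping: OData doubles a single quote inside a quoted string. A minimal sketch of building the filter above safely follows; the helper names are hypothetical, and the `category` and `rating` fields are taken from the example.

```python
def odata_string(value):
    """Quote a string for use in an OData filter, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def build_filter(category, min_rating):
    """Build the example filter expression from untrusted inputs."""
    # bool is a subclass of int, so reject it explicitly
    if isinstance(min_rating, bool) or not isinstance(min_rating, (int, float)):
        raise TypeError("min_rating must be a number")
    return f"category eq {odata_string(category)} and rating gt {min_rating}"


print(build_filter("Technology", 4))
print(build_filter("O'Reilly", 4.5))
```

The returned string can be passed as the `filter` argument of `client.search`.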

## Facets

```python
results = client.search(
    search_text="*",
    facets=["category,count:10", "rating"],
    top=0  # Only get facets, no documents
)

for facet_name, facet_values in results.get_facets().items():
    print(f"{facet_name}:")
    for facet in facet_values:
        print(f"  {facet['value']}: {facet['count']}")
```

## Autocomplete & Suggest

```python
# Autocomplete
results = client.autocomplete(
    search_text="sea",
    suggester_name="my-suggester",
    mode="twoTerms"
)

# Suggest
results = client.suggest(
    search_text="sea",
    suggester_name="my-suggester",
    select=["title"]
)
```

## Indexer with Skillset

```python
from azure.core.credentials import AzureKeyCredential
from azure.search.documents.indexes import SearchIndexerClient
from azure.search.documents.indexes.models import (
    SearchIndexer,
    SearchIndexerDataContainer,
    SearchIndexerDataSourceConnection,
    SearchIndexerSkillset,
    EntityRecognitionSkill,
    InputFieldMappingEntry,
    OutputFieldMappingEntry
)

indexer_client = SearchIndexerClient(endpoint, AzureKeyCredential(key))

# Create data source (connection_string is the storage account connection string)
data_source = SearchIndexerDataSourceConnection(
    name="my-datasource",
    type="azureblob",
    connection_string=connection_string,
    container=SearchIndexerDataContainer(name="documents")
)
indexer_client.create_or_update_data_source_connection(data_source)

# Create skillset
skillset = SearchIndexerSkillset(
    name="my-skillset",
    skills=[
        EntityRecognitionSkill(
            inputs=[InputFieldMappingEntry(name="text", source="/document/content")],
            outputs=[OutputFieldMappingEntry(name="organizations", target_name="organizations")]
        )
    ]
)
indexer_client.create_or_update_skillset(skillset)

# Create indexer
indexer = SearchIndexer(
    name="my-indexer",
    data_source_name="my-datasource",
    target_index_name="my-index",
    skillset_name="my-skillset"
)
indexer_client.create_or_update_indexer(indexer)
```

## Best Practices

1. **Use hybrid search** for best relevance combining vector and keyword
2. **Enable semantic ranking** for natural language queries
3. **Index in batches** of 100-1000 documents for efficiency
4. **Use filters** to narrow results before ranking
5. **Configure vector dimensions** to match your embedding model
6. **Use HNSW algorithm** for large-scale vector search
7. **Create suggesters** when you define the fields (a suggester cannot be added over existing fields later)
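
The batching advice in item 3 can be sketched with a small generator. The helper name `iter_batches` is hypothetical, and the default of 1000 follows the upper bound given in the list.

```python
def iter_batches(documents, batch_size=1000):
    """Yield consecutive slices of `documents`, each at most `batch_size` long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(documents), batch_size):
        yield documents[start:start + batch_size]


docs = [{"id": str(i)} for i in range(2500)]
batch_sizes = [len(batch) for batch in iter_batches(docs, batch_size=1000)]
print(batch_sizes)

# With a real SearchClient, each batch would be sent separately:
# for batch in iter_batches(docs):
#     client.upload_documents(batch)
```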

## Reference Files

| File | Contents |
|------|----------|
| references/vector-search.md | HNSW configuration, integrated vectorization, multi-vector queries |
| references/semantic-ranking.md | Semantic configuration, captions, answers, hybrid patterns |
| scripts/setup_vector_index.py | CLI script to create vector-enabled search index |


---

## Additional Azure AI Search Patterns

# Azure AI Search Python SDK

Write clean, idiomatic Python code for Azure AI Search using `azure-search-documents`.

## Installation

```bash
pip install azure-search-documents azure-identity
```

## Environment Variables

```bash
AZURE_SEARCH_ENDPOINT=https://<search-service>.search.windows.net
AZURE_SEARCH_INDEX_NAME=<index-name>
# For API key auth (not recommended for production)
AZURE_SEARCH_API_KEY=<api-key>
```

## Authentication

**DefaultAzureCredential (preferred)**:
```python
from azure.identity import DefaultAzureCredential
from azure.search.documents import SearchClient

credential = DefaultAzureCredential()
client = SearchClient(endpoint, index_name, credential)
```

**API Key**:
```python
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient

client = SearchClient(endpoint, index_name, AzureKeyCredential(api_key))
```

## Client Selection

| Client | Purpose |
|--------|---------|
| `SearchClient` | Query indexes, upload/update/delete documents |
| `SearchIndexClient` | Create/manage indexes, knowledge sources, knowledge bases |
| `SearchIndexerClient` | Manage indexers, skillsets, data sources |
| `KnowledgeBaseRetrievalClient` | Agentic retrieval with LLM-powered Q&A |

## Index Creation Pattern

```python
from azure.search.documents.indexes import SearchIndexClient
from azure.search.documents.indexes.models import (
    SearchIndex, SearchField, VectorSearch, VectorSearchProfile,
    HnswAlgorithmConfiguration, AzureOpenAIVectorizer,
    AzureOpenAIVectorizerParameters, SemanticSearch,
    SemanticConfiguration, SemanticPrioritizedFields, SemanticField
)

index = SearchIndex(
    name=index_name,
    fields=[
        SearchField(name="id", type="Edm.String", key=True),
        SearchField(name="content", type="Edm.String", searchable=True),
        SearchField(name="embedding", type="Collection(Edm.Single)",
                   searchable=True,  # vector fields must be searchable
                   vector_search_dimensions=3072,
                   vector_search_profile_name="vector-profile"),
    ],
    vector_search=VectorSearch(
        profiles=[VectorSearchProfile(
            name="vector-profile",
            algorithm_configuration_name="hnsw-algo",
            vectorizer_name="openai-vectorizer"
        )],
        algorithms=[HnswAlgorithmConfiguration(name="hnsw-algo")],
        vectorizers=[AzureOpenAIVectorizer(
            vectorizer_name="openai-vectorizer",
            parameters=AzureOpenAIVectorizerParameters(
                resource_url=aoai_endpoint,
                deployment_name=embedding_deployment,
                model_name=embedding_model
            )
        )]
    ),
    semantic_search=SemanticSearch(
        default_configuration_name="semantic-config",
        configurations=[SemanticConfiguration(
            name="semantic-config",
            prioritized_fields=SemanticPrioritizedFields(
                content_fields=[SemanticField(field_name="content")]
            )
        )]
    )
)

index_client = SearchIndexClient(endpoint, credential)
index_client.create_or_update_index(index)
```

## Document Operations

```python
from azure.search.documents import SearchClient, SearchIndexingBufferedSender

# Batch upload with automatic batching
with SearchIndexingBufferedSender(endpoint, index_name, credential) as sender:
    sender.upload_documents(documents)

# Direct operations via SearchClient
search_client = SearchClient(endpoint, index_name, credential)
search_client.upload_documents(documents)      # Add new
search_client.merge_documents(documents)       # Update existing
search_client.merge_or_upload_documents(documents)  # Upsert
search_client.delete_documents(documents)      # Remove
```

## Search Patterns

```python
# Basic search
results = search_client.search(search_text="query")

# Vector search
from azure.search.documents.models import VectorizedQuery

results = search_client.search(
    search_text=None,
    vector_queries=[VectorizedQuery(
        vector=embedding,
        k_nearest_neighbors=5,
        fields="embedding"
    )]
)

# Hybrid search (vector + keyword) with semantic reranking
results = search_client.search(
    search_text="query",
    vector_queries=[VectorizedQuery(vector=embedding, k_nearest_neighbors=5, fields="embedding")],
    query_type="semantic",
    semantic_configuration_name="semantic-config"
)

# With filters
results = search_client.search(
    search_text="query",
    filter="category eq 'technology'",
    select=["id", "title", "content"],
    top=10
)
```

## Agentic Retrieval (Knowledge Bases)

For LLM-powered Q&A with answer synthesis, see references/agentic-retrieval.md.

Key concepts:
- **Knowledge Source**: Points to a search index
- **Knowledge Base**: Wraps knowledge sources + LLM for query planning and synthesis
- **Output modes**: `EXTRACTIVE_DATA` (raw chunks) or `ANSWER_SYNTHESIS` (LLM-generated answers)

## Async Pattern

```python
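import asyncio
from azure.search.documents.aio import SearchClient

async def main():
    # For Entra ID auth, pass an async credential from azure.identity.aio
    async with SearchClient(endpoint, index_name, credential) as client:
        results = await client.search(search_text="query")
        async for result in results:
            print(result["title"])

asyncio.run(main())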

## Best Practices

1. **Use environment variables** for endpoints, keys, and deployment names
2. **Prefer `DefaultAzureCredential`** over API keys for production
3. **Use `SearchIndexingBufferedSender`** for batch uploads (handles batching/retries)
4. **Always define semantic configuration** for agentic retrieval indexes
5. **Use `create_or_update_index`** for idempotent index creation
6. **Close clients** with context managers or explicit `close()`

## Field Types Reference

| EDM Type | Python | Notes |
|----------|--------|-------|
| `Edm.String` | str | Searchable text |
| `Edm.Int32` | int | Integer |
| `Edm.Int64` | int | Long integer |
| `Edm.Double` | float | Floating point |
| `Edm.Boolean` | bool | True/False |
| `Edm.DateTimeOffset` | datetime | ISO 8601 |
| `Collection(Edm.Single)` | List[float] | Vector embeddings |
| `Collection(Edm.String)` | List[str] | String arrays |
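
The table can be read as a lookup from Python values to EDM types. A minimal sketch of that lookup follows; the function name `edm_type_for` is hypothetical, and mapping every list of floats to `Collection(Edm.Single)` is an assumption that fits vector fields only.

```python
from datetime import datetime


def edm_type_for(value):
    """Suggest an EDM type name for a Python value, following the table above."""
    if isinstance(value, bool):  # checked before int because bool subclasses int
        return "Edm.Boolean"
    if isinstance(value, int):
        return "Edm.Int32" if -2**31 <= value < 2**31 else "Edm.Int64"
    if isinstance(value, float):
        return "Edm.Double"
    if isinstance(value, str):
        return "Edm.String"
    if isinstance(value, datetime):
        return "Edm.DateTimeOffset"
    if isinstance(value, list) and value:
        if all(isinstance(item, str) for item in value):
            return "Collection(Edm.String)"
        if all(isinstance(item, float) for item in value):
            return "Collection(Edm.Single)"
    raise TypeError(f"No EDM mapping for {type(value).__name__}")


sample = {"id": "1", "rating": 5, "views": 2**40, "score": 0.5,
          "published": True, "tags": ["a", "b"], "embedding": [0.1, 0.2]}
inferred = {name: edm_type_for(value) for name, value in sample.items()}
print(inferred)
```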

## Error Handling

```python
from azure.core.exceptions import (
    HttpResponseError,
    ResourceNotFoundError,
    ResourceExistsError
)

try:
    result = search_client.get_document(key="123")
except ResourceNotFoundError:
    print("Document not found")
except HttpResponseError as e:
    print(f"Search error: {e.message}")
```

## When to Use
Use this skill when a task calls for the workflows or actions described in the overview.
