bio-spatial-transcriptomics-spatial-data-io

Load spatial transcriptomics data from Visium, Xenium, MERFISH, Slide-seq, and other platforms using Squidpy and SpatialData. Read Space Ranger outputs, convert formats, and access spatial coordinates. Use when loading Visium, Xenium, MERFISH, or other spatial data.

1,802 stars

by FreedomIntelligence

Best use case

bio-spatial-transcriptomics-spatial-data-io is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using bio-spatial-transcriptomics-spatial-data-io should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/bio-spatial-transcriptomics-spatial-data-io/SKILL.md --create-dirs "https://raw.githubusercontent.com/FreedomIntelligence/OpenClaw-Medical-Skills/main/skills/bio-spatial-transcriptomics-spatial-data-io/SKILL.md"

Manual Installation

1. Download SKILL.md from GitHub.
2. Place it in .claude/skills/bio-spatial-transcriptomics-spatial-data-io/SKILL.md inside your project.
3. Restart your AI agent; it will auto-discover the skill.

How bio-spatial-transcriptomics-spatial-data-io Compares

Feature / Agent	bio-spatial-transcriptomics-spatial-data-io	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

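What does this skill do?

It loads spatial transcriptomics data from Visium, Xenium, MERFISH, Slide-seq, and other platforms using Squidpy and SpatialData, reads Space Ranger outputs, converts between formats, and exposes spatial coordinates.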

Where can I find the source code?

The source code is in the FreedomIntelligence/OpenClaw-Medical-Skills repository on GitHub, under skills/bio-spatial-transcriptomics-spatial-data-io/.

SKILL.md Source

## Version Compatibility

Reference examples tested with: anndata 0.10+, numpy 1.26+, pandas 2.2+, scanpy 1.10+, spatialdata 0.1+, squidpy 1.3+

Before using code patterns, verify installed versions match. If versions differ:
- Python: `pip show <package>` then `help(module.function)` to check signatures

If code throws ImportError, AttributeError, or TypeError, introspect the installed
package and adapt the example to match the actual API rather than retrying.
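
The version check described above can be scripted. The following is a minimal sketch using only the standard library. The helper name `installed_versions` is hypothetical, and the package list mirrors the one in this section, with `spatialdata-io` added on the assumption that it is installed under that distribution name (the examples import it as `spatialdata_io`).

```python
from importlib.metadata import PackageNotFoundError, version

# Packages named in the compatibility note (distribution names as used by pip)
PACKAGES = ['anndata', 'numpy', 'pandas', 'scanpy', 'spatialdata', 'spatialdata-io', 'squidpy']


def installed_versions(packages):
    """Return {package: version string, or None if the package is not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, ver in installed_versions(PACKAGES).items():
    print(f'{name}: {ver or "not installed"}')
```

Comparing this report against the tested versions before running the examples makes it easier to tell a version mismatch apart from a genuine bug.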

# Spatial Data I/O

**"Load my Visium spatial data"** → Read spatial transcriptomics outputs (Visium, Xenium, MERFISH, Slide-seq) into AnnData objects with spatial coordinates and tissue images.
- Python: `squidpy.read.visium('spaceranger_out/')`, `spatialdata.read_zarr()`

Load and work with spatial transcriptomics data from various platforms.

## Required Imports

```python
import numpy as np
import pandas as pd
import squidpy as sq
import scanpy as sc
import anndata as ad
import spatialdata as sd
import spatialdata_io as sdio
```

## Load 10X Visium Data

**Goal:** Load Visium spatial transcriptomics data from Space Ranger output into an AnnData object.

**Approach:** Use Squidpy's `read.visium` to parse the output directory, which loads expression, spatial coordinates, and tissue images.

```python
# Load Space Ranger output (standard method)
adata = sq.read.visium('path/to/spaceranger/output/')
print(f'Loaded {adata.n_obs} spots, {adata.n_vars} genes')

# Spatial coordinates are in adata.obsm['spatial']
print(f"Spatial coords shape: {adata.obsm['spatial'].shape}")

# Image is in adata.uns['spatial']
library_id = list(adata.uns['spatial'].keys())[0]
print(f'Library ID: {library_id}')
```
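
Before calling the reader, it can help to confirm that the directory actually looks like Space Ranger output. The sketch below is a hypothetical pre-flight check (the helper `missing_visium_files` is not part of Squidpy). It assumes the standard Space Ranger file names, so adjust the lists if your pipeline renames outputs.

```python
from pathlib import Path

# Assumed layout: standard Space Ranger output names, relative to the output directory
REQUIRED = ['filtered_feature_bc_matrix.h5', 'spatial/scalefactors_json.json']
# Space Ranger 2.0+ writes tissue_positions.csv; older versions write tissue_positions_list.csv
POSITION_FILES = ['spatial/tissue_positions.csv', 'spatial/tissue_positions_list.csv']


def missing_visium_files(path):
    """Return the expected files that are absent under `path` (empty list if all present)."""
    root = Path(path)
    missing = [rel for rel in REQUIRED if not (root / rel).exists()]
    # Either naming of the spot-position table is acceptable
    if not any((root / rel).exists() for rel in POSITION_FILES):
        missing.append(' or '.join(POSITION_FILES))
    return missing


problems = missing_visium_files('path/to/spaceranger/output/')
if problems:
    print('Missing expected files:', problems)
```

An empty list does not guarantee the load will succeed, but a non-empty one usually explains a failure faster than the reader's traceback.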

## Load Visium with Scanpy

**Goal:** Load Visium data using Scanpy's built-in reader as an alternative to Squidpy.

**Approach:** Use `sc.read_visium` to parse Space Ranger output, then access images and scale factors from `adata.uns['spatial']`.

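```python
# Alternative using Scanpy directly
# (marked deprecated in newer Scanpy releases in favor of squidpy.read.visium; check your installed version)
adata = sc.read_visium('path/to/spaceranger/output/')

# The library ID is the single key under adata.uns['spatial']
library_id = list(adata.uns['spatial'].keys())[0]

# Access tissue image
img = adata.uns['spatial'][library_id]['images']['hires']
scale_factor = adata.uns['spatial'][library_id]['scalefactors']['tissue_hires_scalef']
```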

## Load 10X Xenium Data

**Goal:** Load single-cell resolution Xenium spatial data.

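**Approach:** Squidpy's `read` module does not include a Xenium reader, so read the cell-feature matrix with Scanpy and attach the cell centroids from the Xenium cell table. Alternatively, use `spatialdata_io.xenium` as in the SpatialData section.

```python
import pandas as pd

# Load Xenium output: counts from the HDF5 matrix, centroids from the cell table
# (file names assume a standard Xenium output bundle; cell order is assumed to match)
xenium_dir = 'path/to/xenium/output/'
adata = sc.read_10x_h5(f'{xenium_dir}cell_feature_matrix.h5')
cells = pd.read_csv(f'{xenium_dir}cells.csv.gz').set_index(adata.obs_names)
adata.obs = cells
adata.obsm['spatial'] = cells[['x_centroid', 'y_centroid']].to_numpy()
print(f'Loaded {adata.n_obs} cells')

# Xenium has single-cell resolution
print(f"Cell coordinates: {adata.obsm['spatial'].shape}")
```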

## Load with SpatialData (Recommended for New Projects)

**Goal:** Load spatial data into SpatialData objects for unified multi-modal representation.

**Approach:** Use spatialdata-io readers per platform, which organize expression, shapes, and images into a single object.

```python
import spatialdata_io as sdio

# Load Visium as SpatialData object
sdata = sdio.visium('path/to/spaceranger/output/')
print(sdata)

# Load Xenium
sdata = sdio.xenium('path/to/xenium/output/')

# Access components
table = sdata.tables['table']  # AnnData with expression
shapes = sdata.shapes  # Spatial shapes (spots, cells)
images = sdata.images  # Tissue images
```

## Load MERFISH Data

**Goal:** Load MERFISH (Vizgen MERSCOPE) spatial data.

**Approach:** Use spatialdata-io or Squidpy readers to parse MERSCOPE output with cell-by-gene counts and metadata.

```python
# MERFISH (Vizgen MERSCOPE)
sdata = sdio.merscope('path/to/merscope/output/')

# Or as AnnData
adata = sq.read.vizgen('path/to/vizgen/output/', counts_file='cell_by_gene.csv', meta_file='cell_metadata.csv')
```

## Load Slide-seq Data

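```python
import pandas as pd
# Slide-seq / Slide-seqV2: Squidpy has no dedicated reader, so assemble the AnnData manually (assumes beads.csv is a beads x genes table and coords.csv has x, y columns, both indexed by bead barcode)
adata = ad.AnnData(pd.read_csv('beads.csv', index_col=0))
adata.obsm['spatial'] = pd.read_csv('coords.csv', index_col=0).loc[adata.obs_names, ['x', 'y']].to_numpy()
```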

## Load Nanostring CosMx

```python
# CosMx spatial molecular imaging
sdata = sdio.cosmx('path/to/cosmx/output/')
```

## Load Stereo-seq Data

```python
# Stereo-seq (BGI)
sdata = sdio.stereoseq('path/to/stereoseq/output/')
```

## Load from H5AD with Spatial Coordinates

```python
# If you have h5ad with spatial already stored
adata = sc.read_h5ad('spatial_data.h5ad')

# Verify spatial data exists
if 'spatial' in adata.obsm:
    print('Has spatial coordinates')
if 'spatial' in adata.uns:
    print('Has image data')
```
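
Checking that the `spatial` key exists does not confirm that the coordinates are usable. The sketch below adds a few sanity checks on the array itself. The helper `check_spatial_coords` is hypothetical and operates on a plain array, so it can be called as `check_spatial_coords(adata.obsm['spatial'], adata.n_obs)`.

```python
import numpy as np


def check_spatial_coords(coords, n_obs):
    """Return a list of problems found in a spatial coordinate array (empty if none)."""
    coords = np.asarray(coords)
    # Coordinates should be a 2-D array with at least x and y columns
    if coords.ndim != 2 or coords.shape[1] < 2:
        return [f'expected shape (n_obs, 2) or (n_obs, 3), got {coords.shape}']
    problems = []
    if coords.shape[0] != n_obs:
        problems.append(f'{coords.shape[0]} coordinate rows for {n_obs} observations')
    if not np.issubdtype(coords.dtype, np.number):
        problems.append(f'non-numeric dtype {coords.dtype}')
    elif not np.isfinite(coords).all():
        problems.append('contains NaN or infinite values')
    return problems


# Illustrative input only: 5 spots with valid 2-D coordinates
print(check_spatial_coords(np.zeros((5, 2)), n_obs=5))
```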

## Create Spatial AnnData from Scratch

**Goal:** Construct a spatial AnnData object from raw expression and coordinate arrays.

**Approach:** Build an AnnData with spatial coordinates in `obsm['spatial']` and minimal metadata in `uns['spatial']` for Squidpy compatibility.

```python
import numpy as np
import pandas as pd

# Expression matrix
X = np.random.poisson(5, size=(1000, 500))

# Spatial coordinates
spatial_coords = np.random.rand(1000, 2) * 1000  # x, y in pixels

# Create AnnData
adata = ad.AnnData(X)
adata.obs_names = [f'spot_{i}' for i in range(1000)]
adata.var_names = [f'gene_{i}' for i in range(500)]
adata.obsm['spatial'] = spatial_coords

# Add minimal spatial metadata for Squidpy
adata.uns['spatial'] = {
    'library_id': {
        'scalefactors': {'tissue_hires_scalef': 1.0, 'spot_diameter_fullres': 50},
    }
}
```

## Access Spatial Coordinates

```python
# Get coordinates as numpy array
coords = adata.obsm['spatial']
x_coords = coords[:, 0]
y_coords = coords[:, 1]

# Get coordinates as DataFrame
coord_df = pd.DataFrame(adata.obsm['spatial'], index=adata.obs_names, columns=['x', 'y'])
```

## Access Tissue Images

```python
# Get high-resolution image
library_id = list(adata.uns['spatial'].keys())[0]
hires_img = adata.uns['spatial'][library_id]['images']['hires']
lowres_img = adata.uns['spatial'][library_id]['images']['lowres']

# Scale factors
scalef = adata.uns['spatial'][library_id]['scalefactors']
print(f"Hires scale: {scalef['tissue_hires_scalef']}")
print(f"Spot diameter: {scalef['spot_diameter_fullres']}")
```
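
For Visium data, the coordinates in `obsm['spatial']` are in full-resolution pixel units, so they need to be multiplied by the matching scale factor before being drawn on the `hires` or `lowres` image. A minimal sketch of that conversion follows. The helper `to_image_pixels` is hypothetical and the numbers are illustrative, not taken from a real dataset.

```python
import numpy as np


def to_image_pixels(coords, scalefactors, resolution='hires'):
    """Scale full-resolution spot coordinates into the pixel space of a stored image."""
    # Space Ranger names the factors tissue_hires_scalef and tissue_lowres_scalef
    factor = scalefactors[f'tissue_{resolution}_scalef']
    return np.asarray(coords, dtype=float) * factor


# Illustrative values only
example_coords = np.array([[1000.0, 2000.0], [1500.0, 2500.0]])
example_scalef = {'tissue_hires_scalef': 0.1, 'tissue_lowres_scalef': 0.03}
print(to_image_pixels(example_coords, example_scalef, resolution='hires'))
```

With a loaded dataset, the call would be `to_image_pixels(adata.obsm['spatial'], scalef)` using the `scalef` dictionary from the block above.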

## Convert Between Formats

**Goal:** Convert spatial data between SpatialData and AnnData representations.

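**Approach:** Extract the table from SpatialData, make sure it carries coordinates in `obsm['spatial']`, then save as h5ad or zarr.

```python
import numpy as np

# SpatialData to AnnData
sdata = sdio.visium('path/to/data/')
adata = sdata.tables['table'].copy()

# Shapes are GeoDataFrames (geometry + radius) keyed by dataset id, not a fixed 'spots' name.
# The Visium table normally already has obsm['spatial']; fall back to the geometry otherwise.
if 'spatial' not in adata.obsm:
    spots = next(iter(sdata.shapes.values()))
    # Assumes the shapes rows are in the same order as the table observations
    adata.obsm['spatial'] = np.column_stack([spots.geometry.x, spots.geometry.y])

# Save as h5ad
adata.write_h5ad('spatial_converted.h5ad')

# Save SpatialData
sdata.write('spatial_data.zarr')
```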

## Load Multiple Samples

**Goal:** Load and merge spatial data from multiple Visium samples into a single AnnData.

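**Approach:** Iterate over sample directories, give each sample a distinct `library_id`, then concatenate with `ad.concat`, which labels every spot with its sample.

```python
# Load and concatenate multiple Visium samples
samples = ['sample1', 'sample2', 'sample3']
adatas = []

for sample in samples:
    # A distinct library_id keeps each sample's images separate in uns['spatial']
    adata = sq.read.visium(f'data/{sample}/', library_id=sample)
    adata.var_names_make_unique()  # duplicate gene symbols would break concatenation
    adatas.append(adata)

# label/keys record the sample in obs['sample']; index_unique de-duplicates barcodes;
# uns_merge='unique' carries over the per-sample images and scale factors
adata_combined = ad.concat(adatas, label='sample', keys=samples, index_unique='-', uns_merge='unique')
print(f'Combined: {adata_combined.n_obs} spots')
```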
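
The sample list above is hard-coded. When samples sit in sibling directories, the list can be discovered instead. The sketch below assumes each sample is a subdirectory of `data/` containing the default Space Ranger counts file. The helper `find_visium_samples` is hypothetical.

```python
from pathlib import Path


def find_visium_samples(data_dir, counts_file='filtered_feature_bc_matrix.h5'):
    """Return sorted names of subdirectories of `data_dir` that contain `counts_file`."""
    root = Path(data_dir)
    if not root.is_dir():
        return []
    return sorted(p.name for p in root.iterdir() if p.is_dir() and (p / counts_file).is_file())


samples = find_visium_samples('data')
print(f'Found {len(samples)} sample(s): {samples}')
```

Sorting the names keeps the concatenation order stable between runs, which matters if downstream code relies on sample order.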

## Subset by Spatial Region

**Goal:** Extract spots within a rectangular spatial region of interest.

**Approach:** Apply coordinate-based boolean masking on `obsm['spatial']` to filter spots by x/y bounds.

```python
# Select spots in a rectangular region
x_min, x_max = 1000, 2000
y_min, y_max = 1500, 2500

coords = adata.obsm['spatial']
in_region = (coords[:, 0] >= x_min) & (coords[:, 0] <= x_max) & (coords[:, 1] >= y_min) & (coords[:, 1] <= y_max)

adata_region = adata[in_region].copy()
print(f'Selected {adata_region.n_obs} spots')
```
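
The same boolean-mask idea extends to non-rectangular regions. As a sketch, a circular region of interest can be selected by thresholding the Euclidean distance to a center point. The helper `circular_mask` is hypothetical, and the coordinates are illustrative values.

```python
import numpy as np


def circular_mask(coords, center, radius):
    """Return a boolean mask of points within `radius` of `center` (same units as coords)."""
    coords = np.asarray(coords, dtype=float)
    # Use only the first two columns (x, y) in case a z column is present
    distances = np.linalg.norm(coords[:, :2] - np.asarray(center, dtype=float), axis=1)
    return distances <= radius


# Illustrative coordinates only
example_coords = np.array([[0.0, 0.0], [3.0, 4.0], [10.0, 10.0]])
mask = circular_mask(example_coords, center=(0, 0), radius=5)
print(mask)
```

Applied to a dataset, this would read `adata[circular_mask(adata.obsm['spatial'], center, radius)].copy()`, with `center` and `radius` given in the same full-resolution units as the coordinates.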

## Related Skills

- spatial-preprocessing - QC and normalization after loading
- spatial-visualization - Plot spatial data
- single-cell/data-io - Non-spatial scRNA-seq data loading

Related Skills

All of the following are from FreedomIntelligence/OpenClaw-Medical-Skills (1802 stars).

zinc-database

Access ZINC (230M+ purchasable compounds). Search by ZINC ID/SMILES, similarity searches, 3D-ready structures for docking, analog discovery, for virtual screening and drug discovery.

uspto-database

Access USPTO APIs for patent/trademark searches, examination history (PEDS), assignments, citations, office actions, TSDR, for IP analysis and prior art searches.

uniprot-database

Direct REST API access to UniProt. Protein searches, FASTA retrieval, ID mapping, Swiss-Prot/TrEMBL. For Python workflows with multiple databases, prefer bioservices (unified interface to 40+ services). Use this for direct HTTP/REST work or UniProt-specific control.

tooluniverse-spatial-transcriptomics

Analyze spatial transcriptomics data to map gene expression in tissue architecture. Supports 10x Visium, MERFISH, seqFISH, Slide-seq, and imaging-based platforms. Performs spatial clustering, domain identification, cell-cell proximity analysis, spatial gene expression patterns, tissue architecture mapping, and integration with single-cell data. Use when analyzing spatial transcriptomics datasets, studying tissue organization, identifying spatial expression patterns, mapping cell-cell interactions in tissue context, characterizing tumor microenvironment spatial structure, or integrating spatial and single-cell RNA-seq data for comprehensive tissue analysis.

tooluniverse-spatial-omics-analysis

Computational analysis framework for spatial multi-omics data integration. Given spatially variable genes (SVGs), spatial domain annotations, tissue type, and disease context from spatial transcriptomics/proteomics experiments (10x Visium, MERFISH, DBiTplus, Slide-seq, etc.), performs comprehensive biological interpretation including pathway enrichment, cell-cell interaction inference, druggable target identification, immune microenvironment characterization, and multi-modal integration. Produces a detailed markdown report with Spatial Omics Integration Score (0-100), domain-by-domain characterization, and validation recommendations. Uses 70+ ToolUniverse tools across 9 analysis phases. Use when users ask about spatial transcriptomics analysis, spatial omics interpretation, tissue heterogeneity, spatial gene expression patterns, tumor microenvironment mapping, tissue zonation, or cell-cell communication from spatial data.

tooluniverse-expression-data-retrieval

Retrieves gene expression and omics datasets from ArrayExpress and BioStudies with gene disambiguation, experiment quality assessment, and structured reports. Creates comprehensive dataset profiles with metadata, sample information, and download links. Use when users need expression data, omics datasets, or mention ArrayExpress (E-MTAB, E-GEOD) or BioStudies (S-BSST) accessions.

tcga-bulk-data-preprocessing-with-omicverse

Guide Claude through ingesting TCGA sample sheets, expression archives, and clinical carts into omicverse, initialising survival metadata, and exporting annotated AnnData files.

string-database

Query STRING API for protein-protein interactions (59M proteins, 20B interactions). Network analysis, GO/KEGG enrichment, interaction discovery, 5000+ species, for systems biology.

spatial-transcriptomics-tutorials-with-omicverse

Guide users through omicverse's spatial transcriptomics tutorials covering preprocessing, deconvolution, and downstream modelling workflows across Visium, Visium HD, Stereo-seq, and Slide-seq datasets.

single2spatial-spatial-mapping

Map scRNA-seq atlases onto spatial transcriptomics slides using omicverse's Single2Spatial workflow for deep-forest training, spot-level assessment, and marker visualisation.

reactome-database

Query Reactome REST API for pathway analysis, enrichment, gene-pathway mapping, disease pathways, molecular interactions, expression analysis, for systems biology studies.

pubmed-database

Direct REST API access to PubMed. Advanced Boolean/MeSH queries, E-utilities API, batch processing, citation management. For Python workflows, prefer biopython (Bio.Entrez). Use this for direct HTTP/REST work or custom API implementations.