bio-single-cell-cell-annotation

Automated cell type annotation using reference-based methods including CellTypist, scPred, SingleR, and Azimuth for consistent, reproducible cell labeling.

by diegosouzapw

Best use case

bio-single-cell-cell-annotation is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using bio-single-cell-cell-annotation should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/bio-single-cell-cell-annotation/SKILL.md --create-dirs "https://raw.githubusercontent.com/diegosouzapw/awesome-omni-skill/main/skills/data-ai/bio-single-cell-cell-annotation/SKILL.md"

Manual Installation

1. Download SKILL.md from GitHub.
2. Place it in .claude/skills/bio-single-cell-cell-annotation/SKILL.md inside your project.
3. Restart your AI agent; it will auto-discover the skill.

How bio-single-cell-cell-annotation Compares

Feature / Agent	bio-single-cell-cell-annotation	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

What does this skill do?

Automated cell type annotation using reference-based methods including CellTypist, scPred, SingleR, and Azimuth for consistent, reproducible cell labeling.

Where can I find the source code?

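The source code is on GitHub in the diegosouzapw/awesome-omni-skill repository, under skills/data-ai/bio-single-cell-cell-annotation/ (the same path used by the URL in the Installation section).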

SKILL.md Source

# Automated Cell Type Annotation

## CellTypist (Python)

```python
import celltypist
import scanpy as sc

adata = sc.read_h5ad('adata_processed.h5ad')

# List available models
celltypist.models.models_description()

# Download model
celltypist.models.download_models(model='Immune_All_Low.pkl')

# Load model
model = celltypist.models.Model.load(model='Immune_All_Low.pkl')

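# Predict cell types
# CellTypist expects log1p-normalized expression (10,000 counts per cell) in adata.X;
# majority_voting=True refines per-cell predictions within over-clustered groups
predictions = celltypist.annotate(adata, model=model, majority_voting=True)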

# Add predictions to adata
adata = predictions.to_adata()

# Access predictions
adata.obs['cell_type_celltypist'] = adata.obs['majority_voting']
adata.obs['cell_type_confidence'] = adata.obs['conf_score']

# Visualize
sc.pl.umap(adata, color=['cell_type_celltypist', 'conf_score'])
```

## CellTypist with Custom Model

```python
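import celltypist

# Train custom model on a labeled reference AnnData
# (adata_reference.obs['cell_type'] holds the labels; adata_reference and
# adata_query are assumed to be loaded and log1p-normalized beforehand)
new_model = celltypist.train(adata_reference, labels='cell_type', n_jobs=10,
                              feature_selection=True, use_SGD=True)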

# Save model
new_model.write('custom_model.pkl')

# Use custom model
predictions = celltypist.annotate(adata_query, model='custom_model.pkl')
```

## SingleR (R)

```r
library(SingleR)
library(celldex)
library(Seurat)
library(SingleCellExperiment)

seurat_obj <- readRDS('seurat_processed.rds')
sce <- as.SingleCellExperiment(seurat_obj)

# Load reference (multiple available)
ref <- celldex::HumanPrimaryCellAtlasData()
# Other options:
# ref <- celldex::BlueprintEncodeData()
# ref <- celldex::MonacoImmuneData()
# ref <- celldex::ImmGenData()  # mouse

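# Run SingleR
# celldex references are bulk profiles, so the default de.method ('classic') applies;
# de.method = 'wilcox' is intended for single-cell references
pred <- SingleR(test = sce, ref = ref, labels = ref$label.main)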

# Add to Seurat
seurat_obj$SingleR_labels <- pred$labels
seurat_obj$SingleR_pruned <- pred$pruned.labels

# Check annotation quality
plotScoreHeatmap(pred)
plotDeltaDistribution(pred)
```
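
The pruned labels above come from flagging cells whose best score barely stands out from the scores of the other labels. The Python sketch below illustrates that idea in simplified form. It is not SingleR's implementation: the function name `prune_by_delta`, the `nmads=3` default, and the toy scores are all assumptions made for illustration.

```python
from statistics import median


def prune_by_delta(scores, labels, nmads=3.0):
    """Return labels with low-delta outliers replaced by None.

    scores: list of dicts mapping label -> score, one dict per cell.
    labels: assigned (top-scoring) label per cell.
    A cell's delta is its assigned score minus the median score across labels.
    Within each label, cells whose delta lies more than `nmads` MADs below
    that label's median delta are pruned.
    """
    deltas = [s[lab] - median(s.values()) for s, lab in zip(scores, labels)]
    pruned = list(labels)
    for lab in set(labels):
        idx = [i for i, assigned in enumerate(labels) if assigned == lab]
        group = [deltas[i] for i in idx]
        med = median(group)
        # 1.4826 scales the MAD so it is comparable to a standard deviation
        mad = 1.4826 * median([abs(d - med) for d in group])
        cutoff = med - nmads * mad
        for i in idx:
            if deltas[i] < cutoff:
                pruned[i] = None
    return pruned


# Toy scores (hypothetical): the last cell is only weakly separated from other labels
scores = [
    {"T": 0.90, "B": 0.50, "NK": 0.30},
    {"T": 0.92, "B": 0.50, "NK": 0.30},
    {"T": 0.91, "B": 0.50, "NK": 0.20},
    {"T": 0.89, "B": 0.50, "NK": 0.40},
    {"T": 0.55, "B": 0.50, "NK": 0.45},
]
labels = ["T"] * 5
pruned = prune_by_delta(scores, labels)
print(pruned)
```

A pruned cell is not necessarily mislabeled; it is a cell whose assignment deserves a second look, which is what `plotDeltaDistribution` helps with visually.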

## SingleR Fine Labels

```r
# Use fine-grained labels
pred_fine <- SingleR(test = sce, ref = ref, labels = ref$label.fine)

# Combine multiple references
ref1 <- celldex::BlueprintEncodeData()
ref2 <- celldex::MonacoImmuneData()
pred_combined <- SingleR(test = sce, ref = list(BP = ref1, Monaco = ref2),
                          labels = list(ref1$label.main, ref2$label.main))
```

## Azimuth (R/Seurat)

```r
library(Seurat)
library(Azimuth)

seurat_obj <- readRDS('seurat_processed.rds')

# Run Azimuth with PBMC reference
seurat_obj <- RunAzimuth(seurat_obj, reference = 'pbmcref')

# Available references: pbmcref, bonemarrowref, lungref, etc.

# Access predictions
seurat_obj$azimuth_labels <- seurat_obj$predicted.celltype.l2
seurat_obj$azimuth_score <- seurat_obj$predicted.celltype.l2.score

# Visualize
DimPlot(seurat_obj, group.by = 'azimuth_labels', label = TRUE) + NoLegend()
FeaturePlot(seurat_obj, features = 'predicted.celltype.l2.score')
```

## scPred (R)

```r
library(scPred)
library(Seurat)

# Train on reference
reference <- readRDS('reference_seurat.rds')
reference <- getFeatureSpace(reference, 'cell_type')
reference <- trainModel(reference)

# Get training probabilities
get_probabilities(reference)
get_scpred(reference)

# Plot model performance
plot_probabilities(reference)

# Predict on query
query <- readRDS('query_seurat.rds')
query <- scPredict(query, reference)

# Results
query$scpred_prediction
query$scpred_max
```

## Annotation Confidence Filtering

```python
# CellTypist: filter low confidence
high_conf = adata[adata.obs['conf_score'] > 0.5].copy()

# Flag uncertain cells
adata.obs['annotation_uncertain'] = adata.obs['conf_score'] < 0.3
```

```r
# SingleR: use pruned labels (low-quality removed)
seurat_obj$final_labels <- ifelse(is.na(pred$pruned.labels), 'Unknown', pred$labels)

# Azimuth: filter by score
seurat_obj$high_conf_labels <- ifelse(seurat_obj$predicted.celltype.l2.score > 0.7,
                                       seurat_obj$predicted.celltype.l2, 'Low_confidence')
```
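
The cutoffs above (0.3, 0.5, 0.7) are starting points rather than fixed rules, so it can help to see how many cells each one would keep before committing to a threshold. The following is a minimal Python sketch of that check; the helper names `mask_low_confidence` and `retained_fraction` and the toy values are hypothetical, and in practice the inputs would be columns such as `adata.obs['conf_score']`.

```python
def mask_low_confidence(labels, scores, threshold, fallback="Low_confidence"):
    """Keep a label only when its score is strictly above `threshold`."""
    return [lab if score > threshold else fallback
            for lab, score in zip(labels, scores)]


def retained_fraction(scores, threshold):
    """Fraction of cells whose score is strictly above `threshold`."""
    return sum(score > threshold for score in scores) / len(scores)


# Toy data (hypothetical labels and confidence scores)
labels = ["CD4 T", "CD4 T", "B", "NK", "Mono"]
scores = [0.95, 0.40, 0.72, 0.20, 0.65]

masked = mask_low_confidence(labels, scores, threshold=0.7)
# Sweep the cutoffs used in this section to see how many cells each one keeps
sweep = {t: retained_fraction(scores, t) for t in (0.3, 0.5, 0.7)}
print(masked)
print(sweep)
```

If a cutoff discards most cells of one predicted type, that usually says more about the reference's coverage of that population than about the individual cells.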

## Consensus Annotation

```r
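# Combine multiple methods
# celltypist_labels is assumed to have been added to the Seurat metadata beforehand
# (CellTypist runs in Python), and all three label sets must share one naming scheme
annotations <- data.frame(
    SingleR = seurat_obj$SingleR_labels,
    Azimuth = seurat_obj$azimuth_labels,
    CellTypist = seurat_obj$celltypist_labels
)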

# Majority vote
get_consensus <- function(x) {
    tbl <- table(x)
    if (max(tbl) >= 2) names(which.max(tbl)) else 'Ambiguous'
}
seurat_obj$consensus_label <- apply(annotations, 1, get_consensus)
```
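
A majority vote only works when the methods use the same label vocabulary, which reference-based tools generally do not. The Python sketch below shows the same vote with a harmonization step in front of it. The `HARMONIZE` mapping, the helper names, and the example labels are hypothetical and would need to be written for the references actually used.

```python
from collections import Counter

# Hypothetical mapping from method-specific names to one shared vocabulary
HARMONIZE = {
    "T_cells": "T cell", "CD4 T": "T cell", "T cells": "T cell",
    "B_cell": "B cell", "B": "B cell", "B cells": "B cell",
}


def harmonize(label):
    """Map a method-specific label to the shared name; unknown labels pass through."""
    return HARMONIZE.get(label, label)


def consensus_label(votes, min_agree=2, fallback="Ambiguous"):
    """Return the most common label if at least `min_agree` methods agree."""
    counts = Counter(v for v in votes if v is not None)
    if not counts:
        return fallback
    label, n = counts.most_common(1)[0]
    return label if n >= min_agree else fallback


# One tuple per cell: (SingleR, Azimuth, CellTypist) labels, illustrative only
cells = [
    ("T_cells", "CD4 T", "T cells"),
    ("B_cell", "B", "NK cells"),
    ("Monocyte", "NK", "DC"),
]
consensus = [consensus_label([harmonize(v) for v in row]) for row in cells]
print(consensus)
```

Without the mapping, the first cell would be reported as ambiguous even though all three methods agree on its identity.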

## Compare Annotations

```python
import pandas as pd
from sklearn.metrics import adjusted_rand_score, normalized_mutual_info_score

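# Compare two annotations
# (assumes adata.obs has a 'manual_annotation' column; 'cell_type_celltypist'
# is the column created in the CellTypist section)
ari = adjusted_rand_score(adata.obs['manual_annotation'], adata.obs['cell_type_celltypist'])
nmi = normalized_mutual_info_score(adata.obs['manual_annotation'], adata.obs['cell_type_celltypist'])

# Confusion matrix
pd.crosstab(adata.obs['manual_annotation'], adata.obs['cell_type_celltypist'])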
```
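
ARI and NMI summarize agreement in a single number, which can hide the fact that one cell type accounts for most of the disagreement. The sketch below, in plain Python, breaks agreement down per reference label and lists the most frequent mismatches. The function names and toy labels are hypothetical, and exact string matching assumes both annotations already share one vocabulary.

```python
from collections import Counter


def per_label_agreement(reference, predicted):
    """For each reference label, the fraction of its cells with a matching prediction."""
    total = Counter(reference)
    match = Counter(r for r, p in zip(reference, predicted) if r == p)
    return {lab: match[lab] / n for lab, n in total.items()}


def top_confusions(reference, predicted, n=3):
    """Most frequent (reference, predicted) pairs where the two labels differ."""
    pairs = Counter((r, p) for r, p in zip(reference, predicted) if r != p)
    return pairs.most_common(n)


# Toy labels (hypothetical); in practice these would be two adata.obs columns
manual = ["T", "T", "T", "B", "B", "NK"]
auto = ["T", "T", "NK", "B", "B", "T"]

agreement = per_label_agreement(manual, auto)
confusions = top_confusions(manual, auto)
print(agreement)
print(confusions)
```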

## Marker-Based Validation

```r
# Validate predictions with known markers
canonical_markers <- list(
    T_cell = c('CD3D', 'CD3E', 'CD4', 'CD8A'),
    B_cell = c('CD19', 'MS4A1', 'CD79A'),
    Monocyte = c('CD14', 'LYZ', 'S100A8'),
    NK = c('NKG7', 'GNLY', 'NCAM1')
)

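# Check marker expression per predicted type
# unname() avoids DotPlot interpreting the named vector as separate feature groups;
# group.by must be an existing metadata column (here the SingleR labels added earlier)
DotPlot(seurat_obj, features = unname(unlist(canonical_markers)), group.by = 'SingleR_labels') +
    RotatedAxis()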
```
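
The dot plot is read by eye; the same check can be roughly automated by asking, for each predicted group, which marker set has the highest average expression. The Python sketch below does this on precomputed per-group mean expression. The function `best_marker_set` and the toy expression values are hypothetical, and the marker sets are copied from the list above.

```python
from statistics import mean

CANONICAL_MARKERS = {
    "T_cell": ["CD3D", "CD3E", "CD4", "CD8A"],
    "B_cell": ["CD19", "MS4A1", "CD79A"],
    "Monocyte": ["CD14", "LYZ", "S100A8"],
    "NK": ["NKG7", "GNLY", "NCAM1"],
}


def best_marker_set(mean_expr, markers=CANONICAL_MARKERS):
    """Return (name, scores) for the marker set with the highest average expression.

    mean_expr: dict gene -> mean expression within one predicted group.
    Genes missing from `mean_expr` are treated as 0.
    """
    scores = {name: mean([mean_expr.get(g, 0.0) for g in genes])
              for name, genes in markers.items()}
    return max(scores, key=scores.get), scores


# Toy per-group mean expression (hypothetical values); keys are predicted labels
group_means = {
    "T_cell": {"CD3D": 2.1, "CD3E": 1.8, "CD4": 0.9, "CD8A": 0.4, "NKG7": 0.3},
    "NK": {"NKG7": 2.5, "GNLY": 2.2, "NCAM1": 0.6, "CD3D": 0.1},
    # A group predicted as B cells that expresses monocyte markers
    "B_cell": {"LYZ": 2.8, "CD14": 1.9, "S100A8": 2.4, "MS4A1": 0.1},
}

# Groups whose best-matching marker set disagrees with the predicted label
flagged = [label for label, expr in group_means.items()
           if best_marker_set(expr)[0] != label]
print(flagged)
```

This is a coarse screen, since some markers are shared across lineages (CD4 is also expressed by monocytes, for example), so flagged groups still warrant a look at the dot plot.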

## Related Skills

- single-cell/clustering-annotation - Manual marker-based annotation
- single-cell/cell-communication - Use annotated types for CCC
- single-cell/trajectory-inference - Trajectory on annotated data
