bio-single-cell-splicing
Analyzes alternative splicing at single-cell resolution using BRIE2 for probabilistic PSI estimation or leafcutter2 for cluster-based analysis with NMD detection. Identifies cell-type-specific splicing patterns. Use when analyzing isoform usage in scRNA-seq or finding splicing differences between cell populations.
Best use case
bio-single-cell-splicing is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Teams using bio-single-cell-splicing should expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in `.claude/skills/bio-single-cell-splicing/SKILL.md` inside your project
- Restart your AI agent; it will auto-discover the skill
SKILL.md Source
## Version Compatibility
Reference examples tested with: anndata 0.10+, numpy 1.26+, pandas 2.2+, scanpy 1.10+
Before using code patterns, verify installed versions match. If versions differ:
- Python: `pip show <package>` then `help(module.function)` to check signatures
If code throws ImportError, AttributeError, or TypeError, introspect the installed
package and adapt the example to match the actual API rather than retrying.
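The version check described above can also be scripted. This is a minimal sketch using only the standard library; the helper name `installed_versions` is hypothetical and the package list simply mirrors the compatibility note.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Return {package: version string, or None if the package is missing}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# Packages named in the compatibility note above.
for name, ver in installed_versions(["anndata", "numpy", "pandas", "scanpy"]).items():
    print(f"{name}: {ver or 'not installed'}")
```

Compare the printed versions with the tested ones before running the reference examples.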
# Single-Cell Splicing Analysis
Analyze alternative splicing at single-cell resolution.
## Tool Selection
| Tool | Approach | Strengths |
|------|----------|-----------|
| BRIE2 | Probabilistic PSI | Handles sparsity, regulatory features |
| leafcutter2 | Intron clustering | NMD detection, novel junctions |
Note: Avoid Whippet.jl (Julia 1.6.7 only, incompatible with Julia 1.9+)
## BRIE2 Analysis
**Goal:** Estimate per-cell PSI values for splicing events with uncertainty quantification.
**Approach:** Prepare splicing events from annotation, count reads per cell barcode, then fit a Bayesian variational inference model for probabilistic PSI estimation.
**"Analyze splicing in single-cell data"** -> Estimate per-cell inclusion levels for splicing events with uncertainty.
- Python: BRIE2 (probabilistic PSI, handles sparsity)
- Python/R: leafcutter2 (intron clustering, NMD detection)
```python
import brie
import scanpy as sc
import anndata as ad
# Load single-cell data (expression AnnData with cell-type annotations)
adata = sc.read_h5ad('scrnaseq.h5ad')

# NOTE: BRIE2 also ships the brie-count and brie-quant command-line tools.
# Check the Python entry points used below with help(brie) on the installed
# version before running, and adapt them if the API differs.

# Prepare splicing events from annotation
# BRIE2 uses pre-defined splicing events
brie.preprocessing.get_events(
    gtf_file='annotation.gtf',
    out_file='splicing_events.gff3'
)

# Count reads for splicing events from BAM files
# Requires cell barcodes and UMIs
brie.preprocessing.count(
    bam_file='possorted_genome_bam.bam',
    gff_file='splicing_events.gff3',
    out_dir='brie_counts/',
    cell_file='barcodes.tsv'  # Filtered cell barcodes
)

# Load BRIE count data
adata_splice = brie.read_h5ad('brie_counts/brie_count.h5ad')

# Run BRIE2 model for PSI estimation
# Uses variational inference for probabilistic estimates
brie.fit(
    adata_splice,
    layer='raw',
    n_epochs=400,
    batch_size=512
)

# PSI estimates stored in adata_splice.layers['Psi']
# Uncertainty in adata_splice.layers['Psi_var']
# Layer names can vary by version; list them with adata_splice.layers.keys()
```
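To see why a probabilistic PSI estimate helps with sparse counts, consider a toy Beta-binomial sketch. This is not BRIE2's model (which uses variational inference and can include regulatory features); it only assumes a uniform Beta(1, 1) prior on PSI, and the function name `beta_psi_posterior` is hypothetical.

```python
import numpy as np


def beta_psi_posterior(inclusion, exclusion, prior_a=1.0, prior_b=1.0):
    """Posterior mean and variance of PSI under a Beta prior.

    inclusion / exclusion: read counts supporting each isoform per cell.
    """
    inclusion = np.asarray(inclusion, dtype=float)
    exclusion = np.asarray(exclusion, dtype=float)
    a = prior_a + inclusion
    b = prior_b + exclusion
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1.0))
    return mean, var


# Three cells with the same inclusion ratio but different depth.
mean, var = beta_psi_posterior([0, 2, 20], [0, 2, 20])
print(mean)  # identical point estimates
print(var)   # variance shrinks as read depth grows
```

A cell with no reads and a cell with 40 reads get the same point estimate here, but very different variances, which is the information a downstream filter on PSI uncertainty relies on.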
## Cell-Type Specific Splicing
**Goal:** Identify splicing events that vary between cell types.
**Approach:** Compute mean PSI per cell type from BRIE2 output and rank events by cross-cell-type variance.
```python
import numpy as np
import pandas as pd
# Add cell type annotations (aligned on cell barcode; unmatched cells become NaN)
adata_splice.obs['cell_type'] = adata.obs['cell_type']
# Calculate mean PSI per cell type
cell_types = adata_splice.obs['cell_type'].dropna().unique()
psi_matrix = adata_splice.layers['Psi']
mean_psi = pd.DataFrame(index=adata_splice.var_names)
for ct in cell_types:
    # Convert to a NumPy boolean array before indexing the PSI matrix
    mask = (adata_splice.obs['cell_type'] == ct).to_numpy()
    mean_psi[ct] = np.nanmean(psi_matrix[mask, :], axis=0)
# Find cell-type specific splicing events
# Events with high variance across cell types
psi_var = mean_psi.var(axis=1)
variable_events = psi_var.nlargest(100)
print('Top variable splicing events:')
print(variable_events)
```
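Variance across cell-type means is sensitive to cell types with very few PSI estimates for an event. The sketch below is one way to guard against that on a plain cells x events DataFrame; the function name `rank_variable_events` and the `min_cells` parameter are hypothetical, and the data are synthetic.

```python
import numpy as np
import pandas as pd


def rank_variable_events(psi, cell_types, min_cells=3):
    """Rank events by variance of per-cell-type mean PSI.

    psi: cells x events DataFrame, NaN where no estimate exists.
    cell_types: Series of labels indexed by cell.
    A cell type contributes to an event only if it has >= min_cells estimates.
    """
    cell_types = cell_types.reindex(psi.index)
    grouped = psi.groupby(cell_types)
    mean_psi = grouped.mean()    # NaN values are skipped
    n_obs = grouped.count()      # non-NaN cells per (cell type, event)
    mean_psi = mean_psi.where(n_obs >= min_cells)
    n_types = mean_psi.notna().sum(axis=0)
    spread = mean_psi.var(axis=0)
    # Keep events supported by at least two cell types.
    return spread[n_types >= 2].sort_values(ascending=False)


cells = [f"c{i}" for i in range(6)]
psi = pd.DataFrame(
    {
        "e1": [0.9, 0.8, 1.0, 0.1, 0.2, 0.0],
        "e2": [0.5, 0.5, 0.5, 0.5, 0.5, 0.5],
        "e3": [0.9, np.nan, np.nan, 0.1, 0.1, 0.1],
    },
    index=cells,
)
labels = pd.Series(["A", "A", "A", "B", "B", "B"], index=cells)
print(rank_variable_events(psi, labels))
```

Event `e3` is dropped because cell type A has only one estimate for it, so its apparent difference rests on a single cell.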
## leafcutter2 Analysis
**Goal:** Detect differential intron usage in single-cell data with NMD-inducing splicing detection.
**Approach:** Extract junctions from 10X BAMs with cell barcodes, cluster introns, and run differential analysis between cell groups.
```python
import subprocess
# leafcutter2 (April 2025): Adds NMD-inducing splicing detection
# Step 1: Extract junctions from BAM
# Works with 10X BAMs with cell barcodes
subprocess.run([
    'python', 'scripts/bam2junc.py',
    '-b', 'possorted_genome_bam.bam',
    '-o', 'junctions/',
    '--cb_tag', 'CB',  # Cell barcode tag
    '--umi_tag', 'UB'  # UMI tag
], check=True)

# Step 2: Cluster introns
subprocess.run([
    'python', 'clustering/leafcutter_cluster.py',
    '-j', 'junction_files.txt',
    '-o', 'leafcutter_sc',
    '-m', '10',  # Min reads per junction
    '-l', '500000'  # Max intron length
], check=True)

# Step 3: Differential splicing between clusters
# Pseudobulk approach for statistical power
subprocess.run([
    'Rscript', 'scripts/leafcutter_ds.R',
    'leafcutter_sc_perind_numers.counts.gz',
    'groups.txt',
    '-o', 'differential_splicing',
    '-e', 'annotation_exons.txt.gz'
], check=True)
```
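Step 3 reads a `groups.txt` file that the example does not create. The sketch below writes one, assuming the two-column, whitespace-separated layout (sample name, then group) and the two-group limit described in the leafcutter documentation; confirm both against the version in use. The helper name `write_groups_file` and the sample names are hypothetical, and sample names must match the column names of the counts file.

```python
import tempfile
from pathlib import Path


def write_groups_file(sample_to_group, path):
    """Write 'sample group' lines for a two-group comparison."""
    groups = sorted(set(sample_to_group.values()))
    if len(groups) != 2:
        raise ValueError(f"expected exactly two groups, got {groups}")
    for sample in sample_to_group:
        if any(ch.isspace() for ch in sample):
            raise ValueError(f"sample name contains whitespace: {sample!r}")
    lines = [f"{sample} {group}" for sample, group in sample_to_group.items()]
    Path(path).write_text("\n".join(lines) + "\n")
    return groups


# Example with hypothetical pseudobulk samples (one per donor and cell type).
with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp) / "groups.txt"
    write_groups_file(
        {"donor1_T": "T_cell", "donor2_T": "T_cell",
         "donor1_B": "B_cell", "donor2_B": "B_cell"},
        out,
    )
    print(out.read_text())
```

Each group needs several samples for the differential test to be meaningful, so pseudobulk samples are usually built per donor or replicate within a cell type.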
## Pseudobulk Approach
**Goal:** Increase statistical power for splicing analysis by aggregating single cells into pseudobulk samples.
**Approach:** Sum junction counts within cell type groups, then apply bulk differential splicing methods to the aggregated counts.
```python
import pandas as pd
import numpy as np
# For better statistical power, aggregate cells by type
def pseudobulk_junctions(junction_counts, cell_metadata, groupby='cell_type'):
    '''Aggregate junction counts by cell group.

    junction_counts: DataFrame, cells (rows) x junctions (columns)
    cell_metadata: DataFrame indexed by the same cell barcodes
    Returns a junctions x groups DataFrame of summed counts.
    '''
    groups = cell_metadata.groupby(groupby).groups
    pseudobulk = {}
    for group, cells in groups.items():
        cell_mask = junction_counts.index.isin(cells)
        pseudobulk[group] = junction_counts.loc[cell_mask].sum()
    return pd.DataFrame(pseudobulk)

# Run differential splicing on pseudobulk
# Use leafcutter or rMATS on aggregated counts
```
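Once counts are aggregated, a quick sanity check is to look at intron usage ratios within each intron cluster before running a formal test. The sketch below assumes a junctions x groups table like the one returned above and a junction-to-cluster mapping; the function name `intron_usage_ratios` and the toy data are hypothetical, and this is a descriptive summary, not a replacement for leafcutter's statistical model.

```python
import pandas as pd


def intron_usage_ratios(pseudobulk, cluster_ids):
    """Fraction of each cluster's reads carried by each junction, per group.

    pseudobulk: junctions x groups DataFrame of summed counts.
    cluster_ids: Series mapping junction -> intron cluster.
    Ratios are NaN where a cluster has no reads in a group.
    """
    cluster_ids = cluster_ids.reindex(pseudobulk.index)
    cluster_totals = pseudobulk.groupby(cluster_ids).transform("sum")
    return pseudobulk / cluster_totals.where(cluster_totals > 0)


counts = pd.DataFrame(
    {"T_cell": [30, 10, 0, 0], "B_cell": [5, 15, 8, 2]},
    index=["j1", "j2", "j3", "j4"],
)
clusters = pd.Series(["clu1", "clu1", "clu2", "clu2"], index=counts.index)
print(intron_usage_ratios(counts, clusters))
```

In this toy table `j1` carries 75% of `clu1` reads in T cells but 25% in B cells, which is the kind of shift a differential intron usage test looks for.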
## Interpretation Considerations
| Challenge | Mitigation |
|-----------|------------|
| Sparse data | BRIE2 probabilistic model, pseudobulk |
| Low reads per cell | Aggregate similar cells |
| 3' bias (10X) | Use 5' kit or full-length methods |
| Doublets | Filter before splicing analysis |
## Quality Thresholds
| Metric | Recommendation |
|--------|----------------|
| Min cells per event | >= 50 with reads |
| Min reads per junction | >= 5 per cell with coverage |
| PSI confidence | Variance < 0.1 |
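
The thresholds in the table can be applied as boolean masks. The sketch below is one possible reading of them, on plain NumPy arrays: a cell counts as covered for an event when it has at least `min_reads` reads, an event is kept when enough cells are covered, and a PSI value is used only when its variance is below the cutoff. The function name `passing_events` and its parameters are hypothetical.

```python
import numpy as np


def passing_events(counts, psi_var, min_reads=5, min_cells=50, max_var=0.1):
    """Return (keep_events, usable) masks for cells x events arrays.

    keep_events: events with >= min_cells covered cells.
    usable: per-cell entries that are covered, confident, and in a kept event.
    """
    counts = np.asarray(counts)
    psi_var = np.asarray(psi_var, dtype=float)
    covered = counts >= min_reads
    keep_events = covered.sum(axis=0) >= min_cells
    usable = covered & (psi_var < max_var) & keep_events[np.newaxis, :]
    return keep_events, usable


# Tiny synthetic example (4 cells x 3 events) with a lowered cell threshold.
counts = np.array([[5, 0, 10], [6, 1, 0], [0, 7, 9], [8, 0, 3]])
psi_var = np.full(counts.shape, 0.05)
psi_var[0, 0] = 0.2  # one low-confidence estimate
keep_events, usable = passing_events(counts, psi_var, min_cells=2)
print(keep_events)
print(usable.sum(axis=0))
```

Entries outside the `usable` mask can be set to NaN before computing per-cell-type means so that low-confidence estimates do not drive the ranking.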
## Related Skills
- single-cell/preprocessing - QC before splicing analysis
- single-cell/clustering - Cell type annotation
- splicing-quantification - Bulk RNA-seq comparison