bio-multi-omics-similarity-network

Similarity Network Fusion (SNF) for patient stratification using multi-omics data. Integrates multiple data types into a unified patient similarity network. Use when performing patient stratification or integrating multi-omics data into unified similarity networks.

1,802 stars

Best use case

bio-multi-omics-similarity-network is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using bio-multi-omics-similarity-network should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/bio-multi-omics-similarity-network/SKILL.md --create-dirs "https://raw.githubusercontent.com/FreedomIntelligence/OpenClaw-Medical-Skills/main/skills/bio-multi-omics-similarity-network/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/bio-multi-omics-similarity-network/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How bio-multi-omics-similarity-network Compares

| Feature / Agent | bio-multi-omics-similarity-network | Standard Approach |
| --- | --- | --- |
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |

Frequently Asked Questions

What does this skill do?

Similarity Network Fusion (SNF) for patient stratification using multi-omics data. Integrates multiple data types into a unified patient similarity network. Use when performing patient stratification or integrating multi-omics data into unified similarity networks.

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

## Version Compatibility

Reference examples tested with: scanpy 1.10+

Before using code patterns, verify installed versions match. If versions differ:
- R: `packageVersion('<pkg>')` then `?function_name` to verify parameters

If code throws ImportError, AttributeError, or TypeError, introspect the installed
package and adapt the example to match the actual API rather than retrying.

# Similarity Network Fusion

**"Stratify patients using multi-omics data"** → Fuse omics-specific patient similarity networks into a unified network for subtype discovery and clustering.
- R: `SNFtool::SNF()` to fuse networks, `spectralClustering()` for subtyping

## Basic SNF Workflow

**Goal:** Fuse multiple omics-specific patient similarity networks into a single unified network.

**Approach:** Compute per-omics distance and affinity matrices, then iteratively fuse with SNF.

```r
library(SNFtool)

# Load omics data (samples x features)
data1 <- as.matrix(read.csv('rnaseq.csv', row.names = 1))
data2 <- as.matrix(read.csv('methylation.csv', row.names = 1))
data3 <- as.matrix(read.csv('mirna.csv', row.names = 1))

# Ensure matching samples
common <- Reduce(intersect, list(rownames(data1), rownames(data2), rownames(data3)))
data1 <- data1[common, ]
data2 <- data2[common, ]
data3 <- data3[common, ]

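# Compute pairwise distance matrices (samples x samples)
# The SNFtool:: prefix keeps each call unambiguous once a variable named dist2 exists.
# Note: recent SNFtool versions document dist2() as returning squared Euclidean
# distances, and the package examples take the square root, i.e. dist2(x, x)^(1/2);
# check ?dist2 for the installed version.
dist1 <- SNFtool::dist2(data1, data1)
dist2 <- SNFtool::dist2(data2, data2)
dist3 <- SNFtool::dist2(data3, data3)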

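# Construct affinity matrices
# K = number of nearest neighbors (SNFtool docs suggest roughly 10-30)
# alpha = kernel scaling hyperparameter (SNFtool docs suggest roughly 0.3-0.8)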
K <- 20
alpha <- 0.5

aff1 <- affinityMatrix(dist1, K, alpha)
aff2 <- affinityMatrix(dist2, K, alpha)
aff3 <- affinityMatrix(dist3, K, alpha)

# Fuse networks
# t = number of fusion iterations
fused <- SNF(list(aff1, aff2, aff3), K = K, t = 20)
```
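
The workflow above assumes every view has the same samples in the same order and that features are on comparable scales before distances are computed. A minimal base-R sketch of that preparation step follows; `prepare_views` is a hypothetical helper (not part of SNFtool) and the toy matrices stand in for real omics tables. SNFtool also ships `standardNormalization()`, which could replace the `scale()` call.

```r
# Hypothetical helper: align samples across views and z-score each feature.
prepare_views <- function(views) {
    common <- Reduce(intersect, lapply(views, rownames))
    stopifnot(length(common) > 1)
    lapply(views, function(m) {
        m <- m[common, , drop = FALSE]
        # Drop features with zero or undefined variance, which cannot be scaled
        sds <- apply(m, 2, sd, na.rm = TRUE)
        m <- m[, !is.na(sds) & sds > 0, drop = FALSE]
        scale(m)  # mean 0, sd 1 per feature (column)
    })
}

# Toy data standing in for two omics layers (samples x features)
set.seed(1)
rna <- matrix(rnorm(20), nrow = 5,
              dimnames = list(paste0('S', 1:5), paste0('g', 1:4)))
meth <- matrix(runif(12), nrow = 4,
               dimnames = list(paste0('S', c(2, 3, 4, 6)), paste0('cg', 1:3)))
meth <- cbind(meth, const = 1)  # constant feature, removed by the helper

views <- prepare_views(list(rna = rna, meth = meth))
sapply(views, dim)  # rows = shared samples, columns = retained features
```

Each element of `views` can then be passed to the distance step in place of `data1`, `data2`, and so on.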

## Cluster Patients

**Goal:** Identify patient subtypes from the fused similarity network using spectral clustering.

**Approach:** Estimate optimal cluster count from the fused graph, then apply spectral clustering.

```r
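# Estimate the number of clusters; the result lists best and second-best
# candidates from the eigengap and rotation-cost heuristics
estimateNumberOfClustersGivenGraph(fused, NUMC = 2:10)

# Spectral clustering (choose num_clusters from the estimates above and domain knowledge)
num_clusters <- 3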
clusters <- spectralClustering(fused, num_clusters)

# Add to sample metadata (row names are sample IDs so the table can annotate plots)
sample_info <- data.frame(
    Sample = rownames(data1),
    Cluster = factor(clusters),
    row.names = rownames(data1)
)
```
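
To make the idea behind cluster-count estimation concrete, here is a simplified eigengap heuristic in base R. It is a sketch under stated assumptions, not SNFtool's implementation: the toy matrix `W` is constructed by hand, and the candidate range `2:5` is arbitrary.

```r
# Toy similarity matrix: 3 groups of 4 samples, strong within-group similarity
groups <- rep(1:3, each = 4)
W <- ifelse(outer(groups, groups, '=='), 1, 0.01)

# Symmetric normalized Laplacian: L = I - D^(-1/2) W D^(-1/2)
d_inv_sqrt <- 1 / sqrt(rowSums(W))
L <- diag(nrow(W)) - W * outer(d_inv_sqrt, d_inv_sqrt)

# Eigenvalues in ascending order; a large gap after the k-th value suggests k clusters
eig <- sort(eigen(L, symmetric = TRUE, only.values = TRUE)$values)
gaps <- diff(eig)  # gaps[k] = eig[k + 1] - eig[k]

candidates <- 2:5
best_k <- candidates[which.max(gaps[candidates])]
best_k
```

For this toy matrix the largest gap falls after the third eigenvalue, matching the three planted groups. Real fused networks are noisier, so the estimate is best treated as a starting point.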

## Visualize Network

**Goal:** Display the fused patient network as a graph and heatmap with cluster annotations.

**Approach:** Convert the fused matrix to an igraph object, filter weak edges, and render with cluster coloring.

```r
library(igraph)

# Convert to igraph
g <- graph_from_adjacency_matrix(fused, mode = 'undirected', weighted = TRUE, diag = FALSE)

# Remove weak edges
threshold <- quantile(E(g)$weight, 0.9)
g_filtered <- delete_edges(g, E(g)[weight < threshold])

# Plot
V(g_filtered)$color <- clusters
plot(g_filtered, vertex.size = 5, vertex.label = NA,
     edge.width = E(g_filtered)$weight * 2,
     main = 'SNF Patient Network')

# Heatmap
library(pheatmap)
pheatmap(fused, cluster_rows = TRUE, cluster_cols = TRUE,
         annotation_row = sample_info['Cluster'],
         show_rownames = FALSE, show_colnames = FALSE)
```

## Normalized Mutual Information

**Goal:** Evaluate clustering quality by comparing SNF clusters against known subtypes and single-omics baselines.

**Approach:** Compute NMI between predicted clusters and true labels for fused vs individual affinity networks.

```r
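# Compare with known labels
# Assumes phenotype.csv rows are in the same sample order as `clusters`
# (i.e. rownames(data1)); reorder the table first if they are not
true_labels <- read.csv('phenotype.csv')$Subtype
stopifnot(length(true_labels) == length(clusters))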

# NMI score
nmi <- calNMI(clusters, true_labels)
cat('NMI:', nmi, '\n')

# Compare individual vs fused
nmi_rna <- calNMI(spectralClustering(aff1, num_clusters), true_labels)
nmi_meth <- calNMI(spectralClustering(aff2, num_clusters), true_labels)
nmi_mirna <- calNMI(spectralClustering(aff3, num_clusters), true_labels)

cat('NMI RNA only:', nmi_rna, '\n')
cat('NMI Methylation only:', nmi_meth, '\n')
cat('NMI miRNA only:', nmi_mirna, '\n')
cat('NMI Fused:', nmi, '\n')
```
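
For readers who want to see what an NMI score measures, the following base-R sketch computes mutual information between two labelings and normalizes it by the geometric mean of their entropies. `nmi_sketch` is a hypothetical name, and this normalization is one common choice; check the `calNMI` source if exact agreement with SNFtool matters.

```r
# Hypothetical helper: NMI = I(X; Y) / sqrt(H(X) * H(Y))
nmi_sketch <- function(x, y) {
    stopifnot(length(x) == length(y))
    joint <- table(x, y) / length(x)   # joint label distribution
    px <- rowSums(joint)               # marginal of x
    py <- colSums(joint)               # marginal of y
    expected <- outer(px, py)          # joint under independence
    nz <- joint > 0
    mi <- sum(joint[nz] * log(joint[nz] / expected[nz]))
    hx <- -sum(px[px > 0] * log(px[px > 0]))
    hy <- -sum(py[py > 0] * log(py[py > 0]))
    if (hx == 0 || hy == 0) return(0)  # a single-cluster labeling carries no information
    mi / sqrt(hx * hy)
}

# Same partition with different label names: perfect agreement
same_partition <- nmi_sketch(c(1, 1, 2, 2), c('b', 'b', 'a', 'a'))
# Labelings that are statistically independent: no shared information
independent <- nmi_sketch(c(1, 1, 2, 2), c(1, 2, 1, 2))
c(same_partition = same_partition, independent = independent)
```

The score depends only on how samples are grouped, not on the label names, which is why cluster numbers from spectral clustering can be compared directly with named subtypes.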

## Feature Ranking with SNF

**Goal:** Rank features by their contribution to the SNF-derived patient clusters.

**Approach:** Run a one-way ANOVA per feature across cluster assignments and rank features by the p-value of the F-test, smallest first.

```r
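# Rank features by how strongly they separate the SNF clusters
# (per-feature one-way ANOVA; a simple proxy, not a network-based score)

# For each omics layer
rank_features <- function(data, clusters) {
    groups <- factor(clusters)
    # p-value of the ANOVA F-test for each feature (column)
    p_values <- apply(data, 2, function(x) {
        summary(aov(x ~ groups))[[1]][1, 'Pr(>F)']
    })
    # Features where the test is undefined (e.g. constant values) rank last
    p_values[is.na(p_values)] <- 1
    # Feature names ordered from smallest to largest p-value
    names(sort(p_values))
}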

top_rna <- rank_features(data1, clusters)
top_meth <- rank_features(data2, clusters)
```
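
Because one test is run per feature, the raw p-values are usually adjusted for multiple testing before any cutoff is applied. The sketch below is a hypothetical variant (`rank_features_adjusted` is not from the original skill) that returns raw and Benjamini-Hochberg adjusted p-values on toy data.

```r
# Hypothetical variant: per-feature ANOVA p-values with multiple-testing adjustment
rank_features_adjusted <- function(data, clusters, method = 'BH') {
    groups <- factor(clusters)
    p_values <- apply(data, 2, function(x) {
        summary(aov(x ~ groups))[[1]][1, 'Pr(>F)']
    })
    p_values[is.na(p_values)] <- 1  # undefined tests rank last
    out <- data.frame(
        feature = colnames(data),
        p_value = p_values,
        p_adjusted = p.adjust(p_values, method = method),
        row.names = NULL
    )
    out[order(out$p_value), ]
}

# Toy data: one feature separates the two clusters, two do not
toy_clusters <- c(1, 1, 1, 2, 2, 2)
toy <- cbind(
    sig    = c(1, 1.1, 0.9, 5, 5.1, 4.9),
    noise1 = c(1, 2, 3, 1, 2, 3),
    noise2 = c(2, 1, 3, 3, 1, 2)
)
ranked <- rank_features_adjusted(toy, toy_clusters)
ranked
```

Note that the clusters were derived from the same data, so these p-values are optimistic and are better read as a ranking than as formal significance tests.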

## Survival Analysis with Clusters

**Goal:** Assess clinical relevance of SNF clusters by comparing survival outcomes between subtypes.

**Approach:** Fit Kaplan-Meier curves per cluster and test significance with the log-rank test.

```r
library(survival)
library(survminer)

# Load survival data
surv_data <- read.csv('survival.csv')
surv_data$Cluster <- clusters[match(surv_data$Sample, rownames(data1))]

# Kaplan-Meier
fit <- survfit(Surv(Time, Event) ~ Cluster, data = surv_data)

ggsurvplot(fit, data = surv_data, pval = TRUE,
           risk.table = TRUE, palette = 'jco',
           title = 'SNF Cluster Survival')

# Log-rank test
survdiff(Surv(Time, Event) ~ Cluster, data = surv_data)
```
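
When the log-rank p-value is needed as a number (for example to store alongside tuning results), it can be derived from the chi-square statistic that `survdiff()` returns. A minimal sketch on made-up toy data, assuming the `survival` package is installed:

```r
library(survival)

# Toy data: cluster 1 has consistently shorter survival than cluster 2
toy_surv <- data.frame(
    Time = c(5, 6, 7, 8, 20, 22, 25, 30),
    Event = rep(1, 8),
    Cluster = factor(rep(c(1, 2), each = 4))
)

lr <- survdiff(Surv(Time, Event) ~ Cluster, data = toy_surv)

# Degrees of freedom = number of groups - 1
p_value <- pchisq(lr$chisq, df = length(lr$n) - 1, lower.tail = FALSE)
p_value
```

With more than two clusters this tests only whether any difference exists; pairwise comparisons would need their own tests and multiple-testing adjustment.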

## Parameter Tuning

**Goal:** Optimize SNF hyperparameters (K neighbors, alpha) for best clustering performance.

**Approach:** Grid search over K and alpha values, evaluating each combination by NMI against known labels.

```r
# Grid search over K and alpha
K_range <- c(10, 20, 30)
alpha_range <- c(0.3, 0.5, 0.8)

results <- expand.grid(K = K_range, alpha = alpha_range, NMI = NA)

for (i in seq_len(nrow(results))) {
    # Use loop-local names so the aff1/aff2/aff3, fused and clusters objects
    # from the earlier sections are not overwritten
    K_i <- results$K[i]
    alpha_i <- results$alpha[i]
    aff_list <- list(
        affinityMatrix(dist1, K_i, alpha_i),
        affinityMatrix(dist2, K_i, alpha_i),
        affinityMatrix(dist3, K_i, alpha_i)
    )

    fused_i <- SNF(aff_list, K = K_i, t = 20)
    clusters_i <- spectralClustering(fused_i, num_clusters)
    results$NMI[i] <- calNMI(clusters_i, true_labels)
}

best <- results[which.max(results$NMI), ]
cat('Best parameters: K =', best$K, ', alpha =', best$alpha, '\n')
```
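
The grid-search pattern can be separated from the SNF-specific scoring so the same loop works with any metric. In the sketch below, `tune_grid` and `toy_score` are hypothetical names, and `toy_score` is a stand-in function rather than an SNF evaluation; in practice the scoring function would wrap the affinity, fusion, clustering, and scoring steps.

```r
# Hypothetical helper: evaluate score_fn(K, alpha) over a grid, best score first
tune_grid <- function(score_fn, K_range, alpha_range) {
    grid <- expand.grid(K = K_range, alpha = alpha_range)
    grid$score <- mapply(score_fn, grid$K, grid$alpha)
    grid[order(grid$score, decreasing = TRUE), ]
}

# Stand-in scoring function (NOT SNF): peaks at K = 20, alpha = 0.5.
# A real score_fn would run affinityMatrix() -> SNF() -> spectralClustering()
# and return a quality metric such as NMI.
toy_score <- function(K, alpha) -((K - 20) / 10)^2 - (alpha - 0.5)^2

tuned <- tune_grid(toy_score, K_range = c(10, 20, 30), alpha_range = c(0.3, 0.5, 0.8))
head(tuned, 3)
```

Selecting parameters by NMI against known labels is a supervised choice, so the chosen setting should ideally be confirmed on samples that were not used for tuning.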

## Integration with Clinical Features

**Goal:** Incorporate clinical variables as an additional data view in the SNF fusion.

**Approach:** Encode clinical features numerically, compute a clinical affinity matrix, and include it in the SNF fusion step.

```r
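# Add clinical features as another view
clinical <- read.csv('clinical.csv', row.names = 1)
# Align to the samples (and order) used for the omics views
clinical <- clinical[common, , drop = FALSE]
# Expand factors into indicator columns; model.matrix() drops rows containing NA
clinical_numeric <- model.matrix(~ . - 1, data = clinical)
stopifnot(nrow(clinical_numeric) == length(common))

dist_clinical <- dist2(clinical_numeric, clinical_numeric)
aff_clinical <- affinityMatrix(dist_clinical, K, alpha)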

# Fuse all including clinical
fused_with_clinical <- SNF(list(aff1, aff2, aff3, aff_clinical), K = K, t = 20)
```
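
The numeric encoding step has two behaviors worth seeing on a small example: factors expand into indicator columns, and rows with missing values are silently dropped. The toy table below is invented for illustration; its column names are assumptions, not fields from the original `clinical.csv`.

```r
# Toy clinical table with one missing age
clinical_toy <- data.frame(
    age = c(61, 47, NA, 70),
    sex = factor(c('F', 'M', 'F', 'M')),
    stage = factor(c('I', 'II', 'III', 'II')),
    row.names = paste0('S', 1:4)
)

# Factors become indicator columns; numeric columns pass through unchanged
encoded <- model.matrix(~ . - 1, data = clinical_toy)
colnames(encoded)

# S3 is dropped because its age is NA, so this view has fewer samples
kept <- rownames(encoded)
dropped <- setdiff(rownames(clinical_toy), kept)
dropped
```

Since SNF needs the same samples in every view, any samples dropped here would also have to be removed from the omics matrices (or the missing values imputed) before fusing.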

## Related Skills

- mofa-integration - Factor-based integration
- mixomics-analysis - Supervised integration
- single-cell/clustering - Single-cell clustering methods

Related Skills

tooluniverse-spatial-transcriptomics

1802
from FreedomIntelligence/OpenClaw-Medical-Skills

Analyze spatial transcriptomics data to map gene expression in tissue architecture. Supports 10x Visium, MERFISH, seqFISH, Slide-seq, and imaging-based platforms. Performs spatial clustering, domain identification, cell-cell proximity analysis, spatial gene expression patterns, tissue architecture mapping, and integration with single-cell data. Use when analyzing spatial transcriptomics datasets, studying tissue organization, identifying spatial expression patterns, mapping cell-cell interactions in tissue context, characterizing tumor microenvironment spatial structure, or integrating spatial and single-cell RNA-seq data for comprehensive tissue analysis.

tooluniverse-spatial-omics-analysis

Computational analysis framework for spatial multi-omics data integration. Given spatially variable genes (SVGs), spatial domain annotations, tissue type, and disease context from spatial transcriptomics/proteomics experiments (10x Visium, MERFISH, DBiTplus, SLIDE-seq, etc.), performs comprehensive biological interpretation including pathway enrichment, cell-cell interaction inference, druggable target identification, immune microenvironment characterization, and multi-modal integration. Produces a detailed markdown report with Spatial Omics Integration Score (0-100), domain-by-domain characterization, and validation recommendations. Uses 70+ ToolUniverse tools across 9 analysis phases. Use when users ask about spatial transcriptomics analysis, spatial omics interpretation, tissue heterogeneity, spatial gene expression patterns, tumor microenvironment mapping, tissue zonation, or cell-cell communication from spatial data.

tooluniverse-proteomics-analysis

Analyze mass spectrometry proteomics data including protein quantification, differential expression, post-translational modifications (PTMs), and protein-protein interactions. Processes MaxQuant, Spectronaut, DIA-NN, and other MS platform outputs. Performs normalization, statistical analysis, pathway enrichment, and integration with transcriptomics. Use when analyzing proteomics data, comparing protein abundance between conditions, identifying PTM changes, studying protein complexes, integrating protein and RNA data, discovering protein biomarkers, or conducting quantitative proteomics experiments.

protein-interaction-network-analysis

Analyze protein-protein interaction networks using STRING, BioGRID, and SASBDB databases. Maps protein identifiers, retrieves interaction networks with confidence scores, performs functional enrichment analysis (GO/KEGG/Reactome), and optionally includes structural data. No API key required for core functionality (STRING). Use when analyzing protein networks, discovering interaction partners, identifying functional modules, or studying protein complexes.

tooluniverse-network-pharmacology

Construct and analyze compound-target-disease networks for drug repurposing, polypharmacology discovery, and systems pharmacology. Builds multi-layer networks from ChEMBL, OpenTargets, STRING, DrugBank, Reactome, FAERS, and 60+ other ToolUniverse tools. Calculates Network Pharmacology Scores (0-100), identifies repurposing candidates, predicts mechanisms, and analyzes polypharmacology. Use when users ask about drug repurposing via network analysis, multi-target drug effects, compound-target-disease networks, systems pharmacology, or polypharmacology.

tooluniverse-multiomic-disease-characterization

Comprehensive multi-omics disease characterization integrating genomics, transcriptomics, proteomics, pathway, and therapeutic layers for systems-level understanding. Produces a detailed multi-omics report with quantitative confidence scoring (0-100), cross-layer gene concordance analysis, biomarker candidates, therapeutic opportunities, and mechanistic hypotheses. Uses 80+ ToolUniverse tools across 8 analysis layers. Use when users ask about disease mechanisms, multi-omics analysis, systems biology of disease, biomarker discovery, or therapeutic target identification from a disease perspective.

tooluniverse-multi-omics-integration

Integrate and analyze multiple omics datasets (transcriptomics, proteomics, epigenomics, genomics, metabolomics) for systems biology and precision medicine. Performs cross-omics correlation, multi-omics clustering (MOFA+, NMF), pathway-level integration, and sample matching. Coordinates ToolUniverse skills for expression data (RNA-seq), epigenomics (methylation, ChIP-seq), variants (SNVs, CNVs), protein interactions, and pathway enrichment. Use when analyzing multi-omics datasets, performing integrative analysis, discovering multi-omics biomarkers, studying disease mechanisms across molecular layers, or conducting systems biology research that requires coordinated analysis of transcriptome, genome, epigenome, proteome, and metabolome data.

tooluniverse-metabolomics

Comprehensive metabolomics research skill for identifying metabolites, analyzing studies, and searching metabolomics databases. Integrates HMDB (220k+ metabolites), MetaboLights, Metabolomics Workbench, and PubChem. Use when asked to identify or annotate metabolites (HMDB IDs, chemical properties, pathways), retrieve metabolomics study information from MetaboLights (MTBLS*) or Metabolomics Workbench (ST*), search for studies by keywords or disease, or generate comprehensive metabolomics research reports.

tooluniverse-metabolomics-analysis

Analyze metabolomics data including metabolite identification, quantification, pathway analysis, and metabolic flux. Processes LC-MS, GC-MS, NMR data from targeted and untargeted experiments. Performs normalization, statistical analysis, pathway enrichment, metabolite-enzyme integration, and biomarker discovery. Use when analyzing metabolomics datasets, identifying differential metabolites, studying metabolic pathways, integrating with transcriptomics/proteomics, discovering metabolic biomarkers, performing flux balance analysis, or characterizing metabolic phenotypes in disease, drug response, or physiological conditions.

tooluniverse-epigenomics

Production-ready genomics and epigenomics data processing for BixBench questions. Handles methylation array analysis (CpG filtering, differential methylation, age-related CpG detection, chromosome-level density), ChIP-seq peak analysis (peak calling, motif enrichment, coverage stats), ATAC-seq chromatin accessibility, multi-omics integration (expression + methylation correlation), and genome-wide statistics. Pure Python computation (pandas, scipy, numpy, pysam, statsmodels) plus ToolUniverse annotation tools (Ensembl, ENCODE, SCREEN, JASPAR, ReMap, RegulomeDB, ChIPAtlas). Supports BED, BigWig, methylation beta-value matrices, Illumina manifest files, and multi-sample clinical data. Use when processing methylation data, ChIP-seq peaks, ATAC-seq signals, or answering questions about CpG sites, differential methylation, chromatin accessibility, histone marks, or epigenomic statistics.

spatial-transcriptomics-tutorials-with-omicverse

Guide users through omicverse's spatial transcriptomics tutorials covering preprocessing, deconvolution, and downstream modelling workflows across Visium, Visium HD, Stereo-seq, and Slide-seq datasets.

single-cell-multi-omics-integration

Quick-reference sheet for OmicVerse tutorials spanning MOFA, GLUE pairing, SIMBA integration, TOSICA transfer, and StaVIA cartography.