GIZMO is a multi-source biochemistry graph — Reactome reactions, StringDB PPI, gene-catalysis edges — operated on by a per-patient MAP reconstruction with a Laplacian smoothness prior. Each patient's projection decomposes into a signed β (projection onto the log-PageRank substrate backbone) and an α residual (orthogonal mechanism axis). The substrate is what makes mixed-panel cohorts comparable; the decomposition is what surfaces disease biology that PageRank-only and Cohen's-d-only baselines miss.
Map your hit list onto the graph: metabolites by HMDB / KEGG / PubChem, genes and transcripts by Ensembl / Entrez, proteins by UniProt — with fuzzy name matching and dataset-memory from prior corrections. Anchors pin reference nodes so real biomarkers don't drift under graph updates.
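The mapping order described above (prior corrections, then exact identifiers, then fuzzy names) can be sketched as follows. This is a minimal illustration, not the platform's implementation: `ID_INDEX`, `NAME_INDEX`, `map_hit`, the `node:` labels, and the 0.8 similarity cutoff are all hypothetical, and the fuzzy step uses Python's standard `difflib` rather than whatever matcher Forge actually uses.

```python
from difflib import get_close_matches

# Hypothetical lookup tables: external identifier -> graph node, and
# lower-cased display name -> graph node. Node labels are invented.
ID_INDEX = {
    "HMDB0000122": "node:glucose",
    "HMDB0000190": "node:lactate",
    "P04637": "node:TP53",
}
NAME_INDEX = {
    "d-glucose": "node:glucose",
    "l-lactic acid": "node:lactate",
    "tp53": "node:TP53",
}


def map_hit(hit, corrections=None, cutoff=0.8):
    """Resolve one hit to (node, how): correction, exact ID, exact name, fuzzy name."""
    corrections = corrections or {}
    if hit in corrections:  # dataset memory from earlier manual fixes
        return corrections[hit], "correction"
    if hit in ID_INDEX:
        return ID_INDEX[hit], "exact_id"
    key = hit.strip().lower()
    if key in NAME_INDEX:
        return NAME_INDEX[key], "exact_name"
    # Fuzzy fallback: best name whose similarity ratio is at least `cutoff`.
    close = get_close_matches(key, list(NAME_INDEX), n=1, cutoff=cutoff)
    if close:
        return NAME_INDEX[close[0]], "fuzzy_name"
    return None, "unmapped"


hits = ["HMDB0000122", "D-Glucose", "L-lactic acd", "xyzzy"]
mapped = [map_hit(h) for h in hits]
```

Returning the match type alongside the node keeps fuzzy matches reviewable, so a user can confirm or correct them before they enter the dataset memory.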
Each patient's heterogeneous measurements (metabolites, transcripts, proteins) are projected onto the biochem subgraph by minimising λ·fᵀLf + Σ (1/σ²)‖PKf−x‖² — a Laplacian smoothness prior with per-assay σ estimated from low-variance features. The result is one comparable signal vector per patient on the same ~38k-node biochem backbone.
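To make the objective concrete, here is a toy sketch of the reconstruction on a five-node path graph. It assumes PK reduces to a row selection of the identity (each measurement observes exactly one node), which is a simplification: the page does not define P and K. Under that assumption, setting the gradient to zero gives the linear system (λL + W)f = Wx, where W is diagonal with 1/σ² on observed nodes and 0 elsewhere. The system is solved with a Jacobi-preconditioned conjugate-gradient loop. The function names and the value `lam=0.1` are illustrative only.

```python
from math import sqrt


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def map_reconstruct(edges, n, obs, lam=0.1, tol=1e-10, max_iter=1000):
    """Minimise lam * f'Lf + sum_i (1/sigma_i^2) * (f_i - x_i)^2.

    edges: undirected, unweighted edge list; obs: node -> (x, sigma).
    Assumes every node has at least one edge or one observation.
    """
    w = [0.0] * n  # precision weights, non-zero on observed nodes only
    b = [0.0] * n  # right-hand side W x
    for i, (x, sigma) in obs.items():
        w[i] = 1.0 / sigma ** 2
        b[i] = w[i] * x
    deg = [0] * n
    for i, j in edges:
        deg[i] += 1
        deg[j] += 1
    # Diagonal of A = lam * L + W, used as the Jacobi preconditioner.
    diag = [lam * deg[i] + w[i] for i in range(n)]

    def matvec(f):
        out = [w[i] * f[i] for i in range(n)]
        for i, j in edges:
            d = lam * (f[i] - f[j])
            out[i] += d
            out[j] -= d
        return out

    f = [0.0] * n
    r = b[:]  # residual b - A f for the starting point f = 0
    z = [ri / di for ri, di in zip(r, diag)]
    p = z[:]
    rz = dot(r, z)
    for _ in range(max_iter):
        if sqrt(dot(r, r)) < tol:
            break
        Ap = matvec(p)
        step = rz / dot(p, Ap)
        f = [fi + step * pi for fi, pi in zip(f, p)]
        r = [ri - step * api for ri, api in zip(r, Ap)]
        z = [ri / di for ri, di in zip(r, diag)]
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return f


# Toy example: path graph 0-1-2-3-4, with only the two end nodes measured.
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
obs = {0: (0.0, 0.5), 4: (4.0, 0.5)}
f = map_reconstruct(edges, 5, obs)
```

The unmeasured interior nodes receive values interpolated along the graph, which is the role the smoothness prior plays when panels cover different subsets of nodes.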
Sequential OLS regresses each patient's projection on log-PageRank and Louvain community indicators. β is the signed projection onto the substrate's hub backbone (an intensity scalar); α is the orthogonal residual (a mechanism axis). In unimodal disease cohorts β tracks severity; in bimodal cohorts (e.g. IDH-mutant vs IDH-wildtype glioma) β-sign separates subtypes unsupervised.
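A minimal sketch of a two-stage decomposition in this spirit, under stated assumptions: stage one is a simple OLS slope of the centred signal on centred log-PageRank (taken here as β), and stage two regresses the stage-one residual on community indicators, which amounts to subtracting each community's mean residual (what remains is taken here as α). The function name, the centring, and the choice not to orthogonalise the indicators against log-PageRank first are assumptions, not details confirmed by the text.

```python
from math import log


def beta_alpha(signal, pagerank, community):
    """Two-stage OLS: slope on log-PageRank, then community means of the residual."""
    n = len(signal)
    x = [log(p) for p in pagerank]
    x_mean = sum(x) / n
    y_mean = sum(signal) / n
    xc = [xi - x_mean for xi in x]
    yc = [yi - y_mean for yi in signal]

    # Stage 1: signed slope of the signal on the log-PageRank backbone.
    beta = sum(a * b for a, b in zip(xc, yc)) / sum(a * a for a in xc)
    resid = [yi - beta * xi for xi, yi in zip(xc, yc)]

    # Stage 2: OLS on community indicators equals removing per-community means.
    sums, counts = {}, {}
    for c, r in zip(community, resid):
        sums[c] = sums.get(c, 0.0) + r
        counts[c] = counts.get(c, 0) + 1
    alpha = [r - sums[c] / counts[c] for c, r in zip(community, resid)]
    return beta, alpha


# Toy inputs: four nodes in two communities (values are made up).
pagerank = [0.4, 0.3, 0.2, 0.1]
community = [0, 0, 1, 1]
signal = [2.0 * log(p) + 1.0 for p in pagerank]  # pure backbone signal
beta, alpha = beta_alpha(signal, pagerank, community)
```

In this toy case the signal lies entirely on the backbone, so β recovers the slope and α vanishes; any structure left in α on real data is what the backbone and community terms fail to explain.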
Classical hypergeometric over-representation analysis (ORA) against Reactome pathways, with per-pathway fold enrichment and Benjamini-Hochberg FDR (BH-FDR). Foreground = your hit list; background = measured HMDBs (or community-derived). Modules from the supervised arm bridge across pathways rather than recapitulating them; the cross-pathway-bridging claim is the one that survives a matched-K head-to-head against WGCNA.
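The enrichment step is standard enough to sketch in a few lines. The example below computes the hypergeometric upper-tail p-value, the fold enrichment, and Benjamini-Hochberg adjusted q-values for a toy foreground. The function names, pathway names, and identifiers are hypothetical, and the platform's own handling of background selection or minimum pathway size may differ.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when drawing n items from N, of which K belong to the pathway."""
    total = comb(N, n)
    upper = min(K, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total


def bh_fdr(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    q = [0.0] * m
    running = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running = min(running, pvals[i] * m / rank)
        q[i] = running
    return q


def ora(foreground, background, pathways):
    """Over-representation of each pathway in the foreground, within the background."""
    bg = set(background)
    fg = set(foreground) & bg
    N, n = len(bg), len(fg)
    if n == 0:
        return []
    rows = []
    for name, members in pathways.items():
        in_bg = set(members) & bg
        K, k = len(in_bg), len(fg & in_bg)
        if K == 0:
            continue  # pathway has no measured members
        rows.append({
            "pathway": name,
            "overlap": k,
            "fold": (k / n) / (K / N),
            "p": hypergeom_sf(k, N, K, n),
        })
    for row, q in zip(rows, bh_fdr([r["p"] for r in rows])):
        row["q"] = q
    return rows


# Toy data: 20 measured metabolites and two made-up pathways.
background = [f"m{i}" for i in range(20)]
pathways = {
    "A": [f"m{i}" for i in range(5)],
    "B": [f"m{i}" for i in range(10, 20)],
}
results = ora(["m0", "m1", "m2", "m10"], background, pathways)
```

Restricting both the foreground and the pathway membership to the measured background matters: testing against the full pathway database instead inflates enrichment for pathways that the panel happens to cover well.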
The win against PageRank-only and Cohen's-d-only baselines is structural, not absolute: GIZMO surfaces a set of disease anchor genes at median PageRank percentile 52 (mid-substrate; Mann-Whitney p=0.002 vs the Cohen's-d null) that are invisible to either baseline alone. On flat metrics GIZMO ties DIABLO, MOFA+, and SNF; uniqueness is what the framework actually offers.
Single λ-multiplier (0.1) calibrated on Crohn (n=33), applied unchanged across every cohort. CG solver with Jacobi preconditioning; pinned graph bundle (v1.0.0 — 79,998 nodes / 298,906 edges; ~38k after hub-cap). All ten hyperparameters disclosed in-paper. Every run is reproducible against the named active graph version.
Adult-type diffuse glioma splits cleanly along IDH1/IDH2 mutation status — a 5-year-survival-defining axis that drives standard of care. The benchmark: can a single-modality projection through GIZMO separate IDH-mutant from IDH-wildtype patients without any label supervision? The signed β-projection does it in two different cohorts with two different modalities.
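One way to score such an unsupervised separation is a rank-based AUC of each patient's β against IDH labels that are used only for evaluation. The sketch below is an assumption about how that scoring could be done, not the benchmark's code: the function names are hypothetical, the β values and labels are made-up toy numbers, and taking max(AUC, 1 − AUC) to ignore the arbitrary sign convention is an assumption of this sketch.

```python
def rank_auc(scores, labels):
    """P(a positive scores above a negative), with ties counted as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in pos
        for q in neg
    )
    return wins / (len(pos) * len(neg))


def unsupervised_auc(beta, labels):
    """Direction-agnostic AUC: the sign convention of beta is not fixed by labels."""
    a = rank_auc(beta, labels)
    return max(a, 1.0 - a)


# Toy values only, not cohort data: 1 = IDH-mutant, 0 = IDH-wildtype.
toy_beta = [-1.2, -0.4, 0.3, 0.9, -0.1, 1.5]
toy_idh_mutant = [1, 1, 0, 0, 1, 0]
separation = unsupervised_auc(toy_beta, toy_idh_mutant)
```

Because the labels enter only at scoring time, the projection itself stays unsupervised; the labels never influence β.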
On flat benchmark metrics GIZMO ties DIABLO, MOFA+, and SNF. What's unique is structural: substrate-mediated graph inference accesses mid-PageRank disease biology those methods don't surface, and the β/α decomposition functions as an intensity scalar in unimodal disease cohorts and as a subtype-direction indicator in bimodal ones — same projection, two regimes.
Bundle v1.0.0. Reactome + StringDB PPI + gene catalysis. The hub-capped graph (k=200) is what the per-patient MAP solver actually operates on; the full graph is for inspection and downstream enrichment. Every production run is anchored to a single named active graph version.
| Capability | GIZMO | DIABLO | MOFA+ | SNF | MetaboAnalyst |
|---|---|---|---|---|---|
| Multi-omic patient stratification (flat metric) | ties (7/17 cells; MOFA+ wins 14/17 at matched-K) | ✓ (multi-block PLS) | ✓ (probabilistic FA) | ✓ (similarity fusion) | — |
| Panel-agnostic cross-cohort comparability | ✓ 28× Jaccard ratio (Su↔Filbin COVID) | — per-cohort latent space | — per-cohort factor space | — per-cohort similarity | — |
| Mid-substrate disease anchor recovery | ✓ 286 anchors at PR-percentile 52 (p=0.002) | — driven by top-loaded features | — driven by top-weighted factors | — driven by k-NN topology | via pathway enrichment |
| Signed β / α decomposition (intensity vs mechanism) | ✓ unimodal & bimodal regimes | — | — | — | — |
| Unsupervised subtype recovery from a single modality | ✓ IDH-glioma β AUC 0.805 (NMR n=88) | — requires multi-block | — factor selection ambiguous | via clustering | — |
| Mechanistic pathway / ORA on the same projection | ✓ Louvain modules → Reactome ORA | — | — | — | ✓ standalone ORA |
DIABLO, MOFA+, and SNF are mature multi-block / factor models that perform well on flat patient-stratification metrics — GIZMO ties them, it doesn't dominate. What GIZMO uniquely offers is the substrate: a fixed biochemistry graph that makes mixed-panel cohorts comparable, surfaces mid-PageRank disease genes invisible to top-feature-driven methods, and supplies the same projection in two regimes — β as severity scalar in unimodal cohorts, β-sign as subtype direction in bimodal ones.
Upload your metabolite list, map to graph nodes, run analysis. The active graph is shared across every Forge user, so results are comparable across datasets.
Request Forge access →
For biotechs with a compound panel and a disease area. We return a ranked target list with causal chains, counterfactual KO deltas, and druggability scores. Typically in 6–8 weeks from NDA signing.
Scope a project →
Need a proprietary target set, phenotype ontology, or chemistry corpus integrated into the graph additively? We extend the active graph under contract, keeping public nodes intact.
Discuss a custom layer →
GIZMO's method is described across two single-author Cell Systems submissions: the supervised paper covers MAP reconstruction, β/α decomposition, anchor-gene recovery against DIABLO / MOFA+ / SNF, and the IDH-glioma flagship; the unsupervised companion covers panel-agnostic cross-cohort comparability (28× Jaccard ratio on Su↔Filbin COVID). Both are in preparation, with bioRxiv preprints planned before submission.
12 cohorts in the validation ladder (Crohn, Su_COVID, Filbin_COVID, IDH_glioma, Erawijantari, Gao_RA, TCGA_IDH, TCGA_LUAD, KMPLOT_BRCA, GSE89408_RA, HMP2_IBD, CorEvitas_RA). Benchmark suite, admin editor for custom graph layers, and the pinned v1.0.0 graph bundle ship with the platform. For Joey's essays and related work see insilijo.github.io.
Code: github.com/insilijo/GIZMO
Platform: github.com/insilijo/forge