Background: Gliomas are characterized by a high degree of molecular heterogeneity, which impairs the reproducibility of predictive biomarkers derived from bulk molecular profiling because of immune/stromal contamination of tumors and the high prevalence of the IDH mutation signature. Methods: We used MOFA+ to derive intrinsic molecular signatures from transcriptional, methylation, and genomic profiles of 667 diffuse gliomas in The Cancer Genome Atlas (TCGA). Factor scores were then derived for two separate Chinese Glioma Genome Atlas (CGGA) batches (Batch 1, n = 325; Batch 2, n = 693) without retraining the model. The prognostic independence of the identified molecular signatures was assessed using multivariable Cox regression adjusted for IDH mutation status and tumor purity; purity-residualized survival analyses; IDH-stratified Cox regression in each cohort; concordance-index benchmarking against established molecular signatures; and extreme-survival profiling. To characterize the biological significance of the factor signatures, we projected the gene set corresponding to each factor onto a single-cell RNA-seq dataset of GBM (GSE131928). Results: MOFA+ identified 12 latent factors, of which a vascular–extracellular matrix (ECM) remodeling axis (Factor 1) explained the highest multi-omics variance (24.9%) and was the strongest independent prognostic factor. In multivariable Cox regression adjusting for IDH status and tumor purity, Factor 1 remained independently prognostic (HR = 1.67, 95% CI 1.27–2.20, p = 0.0002). In a fully adjusted model additionally including age, WHO grade, MGMT methylation, and 1p/19q codeletion (plus radiotherapy and chemotherapy status in the CGGA cohorts), Factor 1 remained prognostic in both CGGA cohorts (CGGA1: HR = 1.50, p = 3.8 × 10⁻⁵; CGGA2: HR = 1.18, p = 0.003) but lost significance in TCGA (HR = 1.04, p = 0.83), consistent with the cohort-dependent magnitude reported in the IDH-stratified and meta-regression analyses below. Purity-residualized survival analysis showed negligible attenuation of the Factor 1 signal (raw HR = 3.57 vs. residualized HR = 3.72; concordance 96.5%). Within IDH-wildtype gliomas, Factor 1 was significant in both external validation cohorts (CGGA1: HR = 1.64, FDR = 4.6 × 10⁻⁶; CGGA2: HR = 1.20, FDR = 0.02), though the TCGA IDH-wildtype subgroup showed a trend that did not survive FDR correction (FDR = 0.060). Within IDH-mutant gliomas, Factor 1 was strongly prognostic in both CGGA cohorts but was not significant in TCGA (HR = 1.17, FDR = 0.33). These findings should therefore be interpreted as consistent in directionality across cohorts but not uniformly replicated at the FDR-adjusted significance threshold in the TCGA discovery dataset. All validation was performed without model retraining. Concordance index benchmarking on a matched subset (n = 503) showed that Factor 1 achieved discrimination comparable to the Mesenchymal signature (C = 0.797 vs. 0.801; ΔC = −0.004) while outperforming four other established classifiers. Factor 1 consistently separated patients with extreme survival phenotypes (OS < 6 vs. > 15 months) across all three cohorts (all log-rank p < 0.001). Projection onto a single-cell GBM atlas (GSE131928), supported by inferCNV-based malignant-cell classification, localized the Vascular–ECM program to malignant cells and the Immune–ECM axis to myeloid compartments. Conclusions: The Vascular–ECM axis is a reproducible, purity-robust prognostic program in diffuse glioma that remains relevant across IDH-defined subgroups, with directionally consistent adverse effects across three independent datasets: TCGA, CGGA Batch 1, and CGGA Batch 2 (pooled n = 1685). Given the strong co-loading of endothelial, ECM, and myeloid genes observed in the single-cell projection, Factor 1 is best interpreted as a vascular/ECM-associated tumor–microenvironment ecosystem program rather than a malignant-cell-autonomous signature. Its FDR-adjusted significance within IDH-stratified subgroups is cohort-dependent: robust in both CGGA cohorts but attenuated in the TCGA IDH-wildtype (FDR = 0.060) and TCGA IDH-mutant (FDR = 0.33) strata. The pooled signal should therefore be interpreted as evidence of a generalizable biological program rather than a uniformly replicated subgroup-specific biomarker. Factor scores can be calculated from RNA sequencing alone using fixed loadings (Z = XWᵀ), which may have implications for future translational applications. All findings are correlative; a causal role for the Vascular–ECM program in glioma progression, invasion, or therapy resistance remains to be established through functional perturbation experiments.
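The fixed-loading projection stated in the Conclusions (Z = XWᵀ) can be illustrated with a minimal NumPy sketch. This is not the authors' pipeline. The function name `project_factor_scores`, the orientation of `W` (factors × genes, chosen so that XWᵀ is samples × factors), the per-gene centering step, and the synthetic data are all assumptions made for illustration only.

```python
import numpy as np


def project_factor_scores(X, W, feature_means=None):
    """Project expression data onto fixed factor loadings: Z = X_centered @ W.T.

    X : (n_samples, n_genes) expression matrix for a new cohort.
    W : (n_factors, n_genes) loading matrix learned on the training cohort
        (orientation is an assumption; transpose if stored genes x factors).
    feature_means : optional (n_genes,) vector used for centering. If None,
        the new cohort's own per-gene means are used (an assumption; the
        training-cohort means may be the more appropriate choice).
    """
    X = np.asarray(X, dtype=float)
    W = np.asarray(W, dtype=float)
    if X.shape[1] != W.shape[1]:
        raise ValueError("X and W must share the same gene dimension")
    if feature_means is None:
        feature_means = X.mean(axis=0)
    # No model fitting happens here: the loadings stay fixed.
    return (X - feature_means) @ W.T


# Synthetic stand-ins for real data (hypothetical sizes: 12 factors, 200 genes).
rng = np.random.default_rng(0)
W = rng.normal(size=(12, 200))
X_new = rng.normal(size=(5, 200))

Z_new = project_factor_scores(X_new, W)
print(Z_new.shape)
```

Because the loadings are never updated, scores for a new cohort depend only on its own expression values and the chosen centering, which is what makes a no-retraining validation of this kind possible.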