Cancer Transcriptomics ML

Machine learning classification of tumour vs. normal tissue across five TCGA cancer types. Genes that drive the classifiers are then filtered through a two-scale evolutionary analysis (germline conservation and somatic selection) to identify candidate cancer-maintaining gene dependencies.

163 candidate genes identified → 15 cross-validated across cancers → Established drivers confirmed (TP53, PIK3CA, PTEN)

💡 Central hypothesis: ML-predictive genes under both strong germline purifying selection (dN/dS < 0.3) AND somatic positive selection (dN/dS ≥ 1.5, FDR < 0.05) are candidate cancer-maintaining dependencies.
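The two-scale filter in the hypothesis can be sketched as a simple per-gene predicate. This is a minimal illustration, not the project's actual pipeline: the `GeneEvidence` record, its field names, the gene symbols, and the demo values are all hypothetical. Only the three thresholds come from the statement above.

```python
from dataclasses import dataclass

# Thresholds taken from the hypothesis statement.
GERMLINE_DNDS_MAX = 0.3   # purifying selection: germline dN/dS < 0.3
SOMATIC_DNDS_MIN = 1.5    # positive selection: somatic dN/dS >= 1.5
SOMATIC_FDR_MAX = 0.05    # somatic selection call must pass FDR < 0.05


@dataclass(frozen=True)
class GeneEvidence:
    """Per-gene evidence record (hypothetical structure, for illustration)."""
    symbol: str
    ml_predictive: bool     # flagged as predictive by the tumour/normal classifier
    germline_dnds: float    # dN/dS across species (germline scale)
    somatic_dnds: float     # dN/dS within tumours (somatic scale)
    somatic_fdr: float      # FDR-adjusted value for the somatic estimate


def is_candidate(g: GeneEvidence) -> bool:
    """True only if the gene passes the ML filter and both evolutionary filters."""
    return (
        g.ml_predictive
        and g.germline_dnds < GERMLINE_DNDS_MAX
        and g.somatic_dnds >= SOMATIC_DNDS_MIN
        and g.somatic_fdr < SOMATIC_FDR_MAX
    )


def filter_candidates(genes):
    """Return the symbols of all genes that satisfy the combined filter."""
    return [g.symbol for g in genes if is_candidate(g)]


# Made-up values for placeholder gene names, one per failure mode.
demo = [
    GeneEvidence("GENE_A", True, 0.12, 2.10, 0.010),   # passes every filter
    GeneEvidence("GENE_B", True, 0.45, 2.50, 0.001),   # fails germline conservation
    GeneEvidence("GENE_C", True, 0.10, 1.50, 0.200),   # fails somatic FDR
    GeneEvidence("GENE_D", False, 0.05, 3.00, 0.001),  # not ML-predictive
    GeneEvidence("GENE_E", True, 0.29, 1.50, 0.049),   # passes at the boundaries
]
print(filter_candidates(demo))
```

Note that the somatic threshold is inclusive (≥ 1.5) while the germline and FDR thresholds are strict, matching the inequalities as written in the hypothesis.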

🧬 Cancer Type Overview

Cancer type          Bal. Accuracy   Specificity   AUC   Samples   Genes Tested
BRCA (Breast)        -               -             -     -         -
BLCA (Bladder)       -               -             -     -         -
PRAD (Prostate)      -               -             -     -         -
LUAD (Lung Adeno.)   -               -             -     -         -
UCEC (Uterine)       -               -             -     -         -
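For reference, the three performance metrics in the overview can be computed as in the sketch below. It assumes tumour is the positive class (label 1) and normal the negative class (label 0), which the page does not state. The helper names and demo data are illustrative, and the project may well use a library implementation instead. Balanced accuracy is the usual choice when one class is much smaller than the other, as is typical for normal tissue in tumour/normal cohorts.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn), treating label 1 as positive (assumed: tumour)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def specificity(y_true, y_pred):
    """Fraction of true negatives (normal samples) correctly identified."""
    _, tn, fp, _ = confusion_counts(y_true, y_pred)
    return tn / (tn + fp)


def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity; robust to class imbalance."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    sensitivity = tp / (tp + fn)
    return (sensitivity + tn / (tn + fp)) / 2


def auc(y_true, scores):
    """ROC AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Illustrative, imbalanced toy data: six tumour samples, two normal samples.
y_true = [1, 1, 1, 1, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.5, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # 0.5 cut-off is an assumption

print(balanced_accuracy(y_true, y_pred), specificity(y_true, y_pred), auc(y_true, scores))
```

On this toy data plain accuracy is 5/8, while balanced accuracy is 7/12 because only one of the two normal samples is classified correctly. The gap shows why the overview reports balanced accuracy and specificity rather than raw accuracy.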

🔗 Quick Links

📖 Explore Findings

Per-cancer narrative with key candidates and evidence

📊 Model Performance

Comparison of logistic regression (LR), random forest (RF), and multilayer perceptron (MLP) classifiers across the five cancer types

🔬 Evolutionary Analysis

Germline conservation and somatic selection filtering