PROBABILISTIC AND DEEP LEARNING APPROACHES TO MODELING BIOLOGICAL SYSTEMS
MILITE, SALVATORE
2025
Abstract
This thesis investigates probabilistic and deep learning methods for modeling biological systems across various scales, with a specific focus on cancer. The aim is to develop models that are both quantitatively rigorous and biologically meaningful. In the first part, I present a hierarchical Bayesian extension of MOBSTER, a model-based clustering approach for subclonal deconvolution. By introducing a probabilistic hierarchical structure, this method accounts for observational noise and for shared uncertainty across karyotypes, and it allows inference on a substantially larger proportion of the genome. I then apply it to a large cohort of whole-genome sequencing data, where I analyze clonal dynamics, mutation rates, and driver selection patterns. The second part deals with single-cell and multi-omic data integration, as well as the study of cancer cell plasticity and resistance to therapy. I present MIDAA, a deep archetypal analysis method that provides interpretable latent representations, and show its performance on two hematopoiesis multi-omic datasets. Building on this, I present a computational analysis of lentivirally barcoded patient-derived organoids to show that epigenetic heritability plays a key role in drug resistance. This leads to a unified view in which stable epigenetic differences drive diverse cell states and adaptive responses to treatment. The final part focuses on spatial and temporal modeling. I develop the Mixture of Neural Cellular Automata as a stochastic framework for simulating tissue growth and image morphogenesis. I then show preliminary results on designing an agent-based deep learning model to capture cell fate decisions from spatial transcriptomics data.
Overall, this work shows how combining probabilistic and deep learning approaches can provide new ways to study complex biological processes, from cancer evolution to tissue organization, and offers general frameworks that can be applied to a broad range of problems in computational biology.

| File | Size | Format |
|---|---|---|
| phd_unimi_R13490.pdf (embargoed until 18 November 2026; license: Creative Commons) | 63.38 MB | Adobe PDF |
Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/20.500.14242/353687
URN:NBN:IT:UNIMI-353687