PROBABILISTIC AND DEEP LEARNING APPROACHES TO MODELING BIOLOGICAL SYSTEMS

Milite, Salvatore

This thesis investigates probabilistic and deep learning methods for modeling biological systems across scales, with a specific focus on cancer. The aim is to develop models that are both quantitatively rigorous and biologically meaningful. In the first part, I present a hierarchical Bayesian extension of MOBSTER, a model-based clustering approach for subclonal deconvolution. By introducing a probabilistic hierarchical structure, this method accounts for observational noise and for shared uncertainty across karyotypes, and allows inference on a substantially larger proportion of the genome. I then apply it to a large cohort of whole-genome sequencing data, where I analyze clonal dynamics, mutation rates, and driver selection patterns. The second part deals with single-cell and multi-omic data integration, as well as the study of cancer cell plasticity and resistance to therapy. I present MIDAA, a deep archetypal analysis method that provides interpretable latent representations, and demonstrate its performance on two hematopoiesis multi-omic datasets. Building on this, I present a computational analysis of lentivirally barcoded patient-derived organoids to show that epigenetic heritability plays a key role in drug resistance. This leads to a unified view in which stable epigenetic differences drive diverse cell states and adaptive responses to treatment. The final part focuses on spatial and temporal modeling. I develop the Mixture of Neural Cellular Automata as a stochastic framework for simulating tissue growth and image morphogenesis. I then show preliminary results on an agent-based deep learning model designed to capture cell fate decisions from spatial transcriptomics data.
Overall, this work shows how combining probabilistic and deep learning approaches can provide new ways to study complex biological processes, from cancer evolution to tissue organization, and offers general frameworks that can be applied to a broad range of problems in computational biology.

2025


Faculty/Department: Dipartimento di Oncologia ed Emato-Oncologia
Degree programme: MEDICINA DEI SISTEMI
Publication date: 16 Dec 2025
Language: English
Supervisor, Advisor or Tutor: SOTTORIVA, ANDREA
Publisher: Università degli Studi di Milano
Number of pages: 158
Collection: Università degli Studi di Milano

Files in this item:

File	Size	Format
phd_unimi_R13490.pdf (embargoed until 18 Nov 2026; licence: Creative Commons)	63.38 MB	Adobe PDF

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/353687

The NBN code of this thesis is URN:NBN:IT:UNIMI-353687