Bayesian inference of lineage evolution and molecular profiling in cancer and gene therapy
BUSCAROLI, ELENA
2026
Abstract
Cellular populations within multicellular organisms undergo continuous diversification driven by heritable molecular alterations that shape their evolutionary trajectories. This process, known as somatic evolution, arises from the interplay of mutation, selection, and drift, and manifests across biological systems from tissue homeostasis to malignant transformation. In this thesis, I develop Bayesian models for the quantitative inference of lineage evolution and molecular diversification, providing a unified statistical framework to study how cells expand, differentiate, and adapt in both therapeutic and pathological contexts. By integrating probabilistic inference with biological priors and heterogeneous sequencing data, the work establishes a comprehensive modelling perspective that connects clonal dynamics, mutational processes, and single-cell molecular profiling within a common evolutionary framework. In haematopoietic gene therapy, longitudinal integration-site (IS) sequencing enables quantitative tracing of gene-corrected lineages. Bayesian regression models are developed to quantify clonal abundance, lineage commitment, and selection dynamics while accounting for measurement uncertainty and incomplete detection. Extending this framework, Bayesian mixture models implemented in lineaGT jointly reconstruct clonal relationships and mutational profiles, linking lineage tracing with molecular evolution in reconstituted haematopoiesis. In cancer genomics, the thesis introduces BASCULE, a Bayesian model for mutational signature inference and patient clustering across multiple variant types. By embedding biological priors within non-negative matrix factorisation and tensor clustering frameworks, BASCULE identifies known and novel mutational processes, revealing molecular subgroups and clinical associations across tumour types.
Finally, Bayesian inference is extended to single-cell assays through a latent Gaussian process with Binomial likelihood, which captures phylogenetic dependencies to improve somatic variant detection from single-cell RNA sequencing data, connecting cellular-resolution profiles to the broader probabilistic description of clonal evolution. Altogether, this work establishes a coherent Bayesian framework for modelling somatic evolution across scales and systems. By transferring inferential methodologies between cancer and gene therapy, and by extending them to single-cell resolution, the thesis demonstrates that shared evolutionary and statistical principles can describe the dynamics of regeneration, adaptation, and malignancy within a unified mathematical language. This integrative approach provides methodological advances and biological insight, offering a principled route to interpret complex molecular data through the lens of evolution and uncertainty.

| File | Size | Format |
|---|---|---|
| FINAL_REV_phd_thesis_pdfA.pdf (embargoed until 05/02/2027; licence: all rights reserved) | 19.13 MB | Adobe PDF |
Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/20.500.14242/357311
URN:NBN:IT:UNITS-357311