Bayesian inference of lineage evolution and molecular profiling in cancer and gene therapy

BUSCAROLI, ELENA
2026

Abstract

Cellular populations within multicellular organisms undergo continuous diversification driven by heritable molecular alterations that shape their evolutionary trajectories. This process, known as somatic evolution, arises from the interplay of mutation, selection, and drift, and manifests across biological systems from tissue homeostasis to malignant transformation. In this thesis, I develop Bayesian models for the quantitative inference of lineage evolution and molecular diversification, providing a unified statistical framework to study how cells expand, differentiate, and adapt in both therapeutic and pathological contexts. By integrating probabilistic inference with biological priors and heterogeneous sequencing data, the work establishes a comprehensive modelling perspective that connects clonal dynamics, mutational processes, and single-cell molecular profiling within a common evolutionary framework. In haematopoietic gene therapy, longitudinal integration-site (IS) sequencing enables quantitative tracing of gene-corrected lineages. Bayesian regression models are developed to quantify clonal abundance, lineage commitment, and selection dynamics while accounting for measurement uncertainty and incomplete detection. Extending this framework, Bayesian mixture models implemented in lineaGT jointly reconstruct clonal relationships and mutational profiles, linking lineage tracing with molecular evolution in reconstituted haematopoiesis. In cancer genomics, the thesis introduces BASCULE, a Bayesian model for mutational signature inference and patient clustering across multiple variant types. By embedding biological priors within non-negative matrix factorisation and tensor clustering frameworks, BASCULE identifies known and novel mutational processes, revealing molecular subgroups and clinical associations across tumour types.
Finally, Bayesian inference is extended to single-cell assays through a latent Gaussian process with Binomial likelihood, which captures phylogenetic dependencies to improve somatic variant detection from single-cell RNA sequencing data, connecting cellular-resolution profiles to the broader probabilistic description of clonal evolution. Altogether, this work establishes a coherent Bayesian framework for modelling somatic evolution across scales and systems. By transferring inferential methodologies between cancer and gene therapy, and by extending them to single-cell resolution, the thesis demonstrates that shared evolutionary and statistical principles can describe the dynamics of regeneration, adaptation, and malignancy within a unified mathematical language. This integrative approach provides methodological advances and biological insight, offering a principled route to interpret complex molecular data through the lens of evolution and uncertainty.
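The non-negative matrix factorisation idea behind mutational signature inference can be illustrated with a minimal sketch. This is a generic, non-Bayesian factorisation using the standard multiplicative updates for the Kullback-Leibler objective; it is not BASCULE's model, which the abstract does not specify in detail. The function name `nmf_kl`, its parameters, and the layout of the count matrix (patients by mutation contexts) are assumptions made for illustration only.

```python
import numpy as np


def nmf_kl(counts, n_signatures, n_iter=500, seed=0, eps=1e-12):
    """Factorise counts (patients x contexts) as exposures @ signatures.

    Hypothetical helper for illustration: plain NMF with multiplicative
    updates for the KL divergence, with no priors and no clustering.
    """
    rng = np.random.default_rng(seed)
    n_patients, n_contexts = counts.shape
    # Random strictly positive starting point.
    exposures = rng.random((n_patients, n_signatures)) + eps
    signatures = rng.random((n_signatures, n_contexts)) + eps

    for _ in range(n_iter):
        # Update exposures given the current signatures.
        ratio = counts / (exposures @ signatures + eps)
        exposures *= (ratio @ signatures.T) / (signatures.sum(axis=1) + eps)
        # Update signatures given the new exposures.
        ratio = counts / (exposures @ signatures + eps)
        signatures *= (exposures.T @ ratio) / (exposures.sum(axis=0)[:, None] + eps)

    # Rescale so each signature is a probability distribution over contexts;
    # the exposures absorb the scale, leaving the product unchanged.
    scale = signatures.sum(axis=1, keepdims=True)
    signatures /= scale
    exposures *= scale.T
    return exposures, signatures
```

Under these assumptions, each row of `signatures` sums to one and can be read as a mutational process, while each row of `exposures` gives the number of mutations attributed to each process in one patient. A Bayesian treatment would replace the point estimates with posterior distributions and encode known signatures as priors.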
5 February 2026
Italian
Gene Therapy; Cancer genomics; Bayesian models; Mutational signature; Single-cell seq
CARAVAGNA, GIULIO
Università degli Studi di Trieste
Files in this item:
File: FINAL_REV_phd_thesis_pdfA.pdf
Access: under embargo until 5 February 2027
License: All rights reserved
Size: 19.13 MB
Format: Adobe PDF

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/357311
The NBN code of this thesis is URN:NBN:IT:UNITS-357311