DEVELOPMENT OF A BIOINFORMATIC SUITE TO IMPROVE THE CLINICAL IMPLEMENTATION OF NEXT GENERATION SEQUENCING IN PRECISION MEDICINE FOR ONCOLOGY

Vozza, Gianluca

The identification of clinically relevant variants has become increasingly data-driven, and many projects have been launched to promote NGS-based molecular profiling, particularly to define diagnostic, prognostic, and therapeutic pathways for cancer patients. It would be particularly useful to map the country-level prevalence of mutations in cancer predisposition genes (CPGs), in both affected individuals and their relatives, to assess the heritability of specific mutations predisposing to solid tumors such as breast, ovarian, and colon cancers. In addition, comprehensive profiling of patients' tumor genomes is needed to detect variants that may indicate a potential therapeutic response to new drug treatments. This requires action within a structured network of the main oncology institutions across the country, through collaborative efforts such as ACC, and a centralized system for storing, analyzing, and interpreting large volumes of data. To support centralized analysis and interpretation, we have developed bioinformatic tools to: i) perform rigorous quality control (QC) on reads used in the clinical setting to detect coverage biases in clinically relevant regions; ii) benchmark and standardize variant calling pipelines through software that simplifies fine-tuning, reduces the time required to curate potential false negatives (FNs), and harmonizes the variant notations found in VCF files from major variant callers in a precise and reproducible manner; iii) interpret and report variants according to ACMG guidelines (2015 for germline variants and 2017 for somatic variants) and develop ML-based algorithms to assess the pathogenicity of variants of uncertain significance (VUS).
These efforts converged into four tools: i) Covdetect, which detects coverage biases in relevant genomic regions such as those where variants of interest (VOIs) lie; ii) RecallME, which harmonizes different variant notation formats and benchmarks variant calling (VC) pipelines against a set of expected variants; iii) MolProBoard, which visualizes and interprets reported VOIs; iv) RENOVO, which assesses the pathogenicity of VUS. These tools were then bundled into a single suite for bringing NGS-derived data into clinical settings, as presented in this thesis.
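As an illustration of what harmonizing variant notations and benchmarking against expected variants can involve, the following minimal Python sketch trims bases shared by the reference and alternate alleles, drops a `chr` prefix, and then computes recall over a set of expected variants. It is not RecallME's implementation, which is not described in this record: the `Variant` type and the `normalize` and `recall` functions are hypothetical names, and the trimming shown here is a simplification that does not left-align indels against a reference genome.

```python
from typing import Iterable, NamedTuple


class Variant(NamedTuple):
    """Hypothetical minimal variant record (not a full VCF line)."""
    chrom: str
    pos: int  # 1-based position of the first base of ref
    ref: str
    alt: str


def normalize(v: Variant) -> Variant:
    """Drop a 'chr' prefix, upper-case alleles, and trim shared bases."""
    chrom = v.chrom[3:] if v.chrom.lower().startswith("chr") else v.chrom
    pos, ref, alt = v.pos, v.ref.upper(), v.alt.upper()
    # Trim shared trailing bases, keeping at least one base per allele.
    while len(ref) > 1 and len(alt) > 1 and ref[-1] == alt[-1]:
        ref, alt = ref[:-1], alt[:-1]
    # Trim shared leading bases, shifting the position to match.
    while len(ref) > 1 and len(alt) > 1 and ref[0] == alt[0]:
        ref, alt = ref[1:], alt[1:]
        pos += 1
    return Variant(chrom, pos, ref, alt)


def recall(expected: Iterable[Variant], called: Iterable[Variant]):
    """Return (recall, missed expected variants) after normalization."""
    exp = {normalize(v) for v in expected}
    got = {normalize(v) for v in called}
    missed = exp - got  # candidate false negatives to curate
    value = (len(exp) - len(missed)) / len(exp) if exp else 0.0
    return value, missed


# Made-up example data: the same deletion written in two notations.
expected = [
    Variant("chr17", 100, "A", "G"),
    Variant("17", 200, "CTT", "CT"),
    Variant("13", 300, "G", "T"),
]
called = [
    Variant("17", 100, "a", "g"),
    Variant("chr17", 200, "CT", "C"),
]
value, missed = recall(expected, called)
print(f"recall = {value:.2f}, missed = {sorted(missed)}")
```

The variants left in `missed` are the candidate false negatives that would need manual curation; a production tool would additionally need reference-based left-alignment and handling of multi-allelic records.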

2023

Faculty/Department: Dipartimento di Oncologia ed Emato-Oncologia
Degree programme: MEDICINA DEI SISTEMI
Publication date: 12 Dec 2023
Language: English
Keywords: variant calling; pipeline benchmarking; recallme; bioinformatics; oncology; precision medicine; machine learning; breast cancer; clinical trials; biostatistics
Co-supervisor, opponent, co-tutor, or coordinators: MINUCCI, SAVERIO
Publisher: Università degli Studi di Milano
Collection: Università degli Studi di Milano

Files in this record:

File	Size	Format
phd_unimi_R12747.pdf (embargoed until 20/05/2025)	17.61 MB	Adobe PDF

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/112796

The NBN code of this thesis is URN:NBN:IT:UNIMI-112796