DEVELOPMENT OF A BIOINFORMATIC SUITE TO IMPROVE THE CLINICAL IMPLEMENTATION OF NEXT GENERATION SEQUENCING IN PRECISION MEDICINE FOR ONCOLOGY
VOZZA, GIANLUCA
2023
Abstract
The identification of clinically relevant variants has become increasingly data-driven, and many projects have been launched to promote the implementation of NGS-based molecular profiling, particularly to define diagnostic, prognostic, and therapeutic pathways for cancer patients. It would be particularly useful to map the country-level mutational prevalence in CPGs in both the affected population and their relatives, in order to assess the degree of heritability of specific mutations predisposing to different types of solid tumors, such as breast, ovarian, and colon cancers. Additionally, comprehensive profiling of patients' tumor genomes is needed to detect variants that may indicate a potential therapeutic response to new drug treatments. Achieving this requires action within a structured network composed of the main oncology institutions across the country, through collaborative efforts such as ACC, and the implementation of a centralized system for the storage, analysis, and interpretation of large volumes of data. For the purpose of centralized analysis and interpretation, we have developed multiple bioinformatic tools to: i) perform rigorous QC on the reads used in the clinical setting to detect any coverage biases in clinically relevant regions; ii) benchmark and standardize variant calling pipelines through software that simplifies fine-tuning and reduces the time required for the curation of potential FNs, and that also harmonizes the various variant notations in VCF files from major variant callers in a precise and reproducible manner; iii) interpret and report variants based on ACMG guidelines (2015 for germline variants and 2017 for somatic variants) and develop ML-based algorithms for assessing the pathogenicity of variants of uncertain significance.
This work resulted in the development of four bioinformatic tools: i) Covdetect, for detecting coverage biases in relevant genomic regions, such as those where VOIs lie; ii) RecallME, to harmonize different variant notation formats and to benchmark VC pipelines on a set of expected variants; iii) MolProBoard, to visualize and interpret reported VOIs; iv) RENOVO, to assess the pathogenicity level of VUS. These tools were then bundled into a single suite for bringing NGS-derived data into clinical settings, as presented in this thesis.

File | Size | Format
---|---|---
phd_unimi_R12747.pdf (embargoed until 20/05/2025) | 17.61 MB | Adobe PDF
Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/20.500.14242/112796
URN:NBN:IT:UNIMI-112796