CHARACTERIZATION OF THE FUNCTIONAL ASPECTS OF NON-GLOBULAR DOMAINS
BEVILACQUA, MARTINA
2024
Abstract
This thesis presents my contributions to the field of intrinsically disordered proteins (IDPs) and tandem repeat proteins (TRPs), focusing on the resource development and data analysis needed to understand their functional aspects. The first part highlights the importance of visualization tools in understanding the structural and functional aspects of IDPs and TRPs. Tools such as ProseqViewer and FeatureViewer enable the representation of disordered regions and repeat motifs, as well as other protein features such as mutations or secondary structure. These visualization tools, along with other established methods, play a crucial role in interpreting experimental data, validating computational predictions, and communicating complex protein structures to the scientific community and a wider audience. The second part addresses the importance of databases for non-globular proteins (NGPs) in facilitating access to up-to-date information on their sequences, structures, and interactions. Notably, improvements in the functionality, content, and style of the DisProt, MobiDB and RepeatsDB databases have significantly enhanced our understanding of NGP functions and the experimental evidence supporting them. The advent of cutting-edge prediction methods, which we evaluated in the Critical Assessment of Protein Intrinsic Disorder Prediction (CAID), and the impact of AlphaFold have further emphasized the prevalence of disordered regions and underscored the significance of IDPs in proteomes. Building on this foundation, future efforts in the IDP field aim to expand the integration, generation, and standardization of disorder knowledge, supporting continued advances in our understanding of this class of proteins. Finally, the third part of the thesis focuses on TRPs, specifically through the analysis of the RepeatsDB database and the exploration of TRP functions.
RepeatsDB has evolved into a central resource for the characterization and classification of TRPs, providing a benchmark for repeat detection algorithms. The recent release of RepeatsDB introduces novel data visualization techniques and an extended classification schema, facilitating the annotation and comparative analysis of TRPs from different sources. I evaluate the RepeatsDB-lite2 predictor on RepeatsDB curated data, demonstrating an improvement over the previous tool, RepeatsDB-lite, and then use the predictor to analyse TRPs on a proteome-wide scale. Moreover, I investigate the role of TRPs as protein scaffolds and provide examples to discuss why they specialized as such. In addition, I test the hypothesis that TRPs are probable disease candidates and consider some of the parameters that could explain this phenomenon: intrinsic factors such as sequence length and radius of gyration, and their role as hubs in the protein-protein interaction network. The findings highlight the importance of recognizing, classifying and studying TRPs for a better understanding of the cell machinery and disease emergence, and open up a promising field in biomolecular engineering.

File | Size | Format |
---|---|---|
tesi_Martina_Bevilacqua.pdf (open access) | 7.04 MB | Adobe PDF |
Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/20.500.14242/218364
URN:NBN:IT:UNIPD-218364