Deep Learning approaches for the functional inference of unknown proteins: Argot 3.0

BIANCA, FEDERICO
2024

Abstract

Protein function annotation is a formidable challenge in modern biology, particularly given the rapid pace of genome sequencing across species. The task consists of assigning specific functions to proteins, especially newly sequenced ones that are absent from existing biological databases. A comprehensive and accurate database of all proteins and their functions is of paramount importance for understanding life processes at the molecular level, and it would significantly benefit biomedical, biological, and pharmaceutical research. A major hurdle in building such a database is that new genomes are sequenced faster than existing ones can be annotated. Despite the rapid generation of genomic data, computational methods still cannot annotate protein functions with the precision of experimental methods, which are time-consuming and costly. Bioinformatics seeks to bridge this gap by developing increasingly efficient and accurate computational methods that can manage the vast volume of available biological data and deliver real-time, precise predictions. The advent of Artificial Intelligence (AI) and Machine Learning (ML) has catalyzed substantial progress in protein function prediction. This work introduces Argot 3.0, an automated protein function prediction tool. Leveraging advanced machine learning techniques and tailored taxonomic filters, Argot 3.0 assigns functions to proteins with high precision, in line with current technological advances. The incorporation of AI and ML not only speeds up predictions but also contributes to the ongoing evolution of computational methods for protein function annotation. Argot 3.0 illustrates the potential of these technologies to advance our understanding of protein function as the volume of biological data continues to grow.
17 Oct 2024
English
TOPPO, STEFANO
Università degli studi di Padova
Files in this item:
File: FBianca_tesi.pdf
Access: open access
Size: 13.23 MB
Format: Adobe PDF

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/218268
The NBN code of this thesis is URN:NBN:IT:UNIPD-218268