Protein secondary structure prediction: novel methods and software architectures

Ledda, Filippo Giuseppe

Owing to the strict relationship between protein structure and function, the prediction of protein tertiary structure has become one of the most important tasks in recent years. Despite recent advances, building the complete protein tertiary structure is still not a tractable task in most cases; in the absence of a clear homology relationship, the problem is often decomposed into smaller subtasks, including the prediction of the secondary structure. Notwithstanding the large variety of different strategies proposed over the years, secondary structure prediction is still an open problem, and few advances in the field have been made in recent times. In this thesis, the problem of secondary structure prediction is first analyzed, identifying five different information sources related to the biological essence of the problem, to be exploited in a learning system. After describing a general software architecture and framework aimed at dealing with the issues related to the engineering and setup of prediction systems applied to real-world problems, different techniques based on the encoding and decoding of biological information, together with custom software architectures, are presented. The different proposals are assessed experimentally. The best improvements are consistent with the recent advances in the field (about 1-2% in the last ten years), confirming the validity of the assumption that the correlation sources identified can be further exploited to improve predictions.

2011

Publication date: 2011
Language: it
Collection: BNCF

Files in this item:

File	Size	Format
PhD_Filippo_G_Ledda.pdf (access only from BNCF and BNCR; Type: Other attached material; License: All rights reserved)	6.93 MB	Adobe PDF

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/314172

The NBN code of this thesis is URN:NBN:IT:BNCF-314172