A DISTRIBUTED ENVIRONMENT FOR HIGH THROUGHPUT SEQUENCING AND OTHER BIOINFORMATIC DATA ANALYSIS

Zanfardino, Mario

The rapid growth of data produced by modern biomedical studies, and the correspondingly increasing demand for bioinformatic data processing, make it necessary to use the available hardware and software resources optimally. Despite rapid gains in processor and storage capacity, single local computers cannot keep up with the data: current advances in genome sequencing technology are expected to produce genome-sequence data faster than processor speeds improve. With the architectures described in the introduction, great progress is being made in biological data processing, and today they are commonly used by sophisticated bioinformatic tools and pipelines to solve computationally intensive problems. There is still a need, however, for better interfaces able to integrate tools of different origin and to offload processing to remote servers. In this work, a platform has been designed to integrate applications of different types and to isolate services and databases from one another by providing a middle service layer, which could offer significant advantages in the development of bioinformatic data processing systems. In particular, an Enterprise Service Bus architecture has been considered as a possible platform around which a full data processing system might be centred. This type of architecture is intrinsically able to integrate different service resources through a bus-like infrastructure and provides significant features such as lightweight interfacing and easy expansion.

2016



Publication date: 2016
Language: it
Collection: BNCF

Files in this item:

File	Size	Format
tesi_MZanfardino_v18%202.pdf (access only from BNCF and BNCR; Type: Other attached material; License: All rights reserved)	955.98 kB	Adobe PDF

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/324575

The NBN code of this thesis is URN:NBN:IT:BNCF-324575