Biclustering of gene expression data: hybridization of GRASP with other heuristic/metaheuristic approaches

2013

Abstract

Researchers who work with large amounts of data face various problems in data mining and information retrieval; gene expression analysis is one such case. The general aim of gene expression experiments is to find co-regulated genes in order to understand the biological pathways underlying a particular phenomenon. Clustering can be used to determine whether co-regulated genes are active only under some conditions. Recently, biclustering approaches have been used to find groups of co-regulated genes in a data matrix. Among them, several heuristic algorithms have been developed to find good solutions in a reasonable running time. In this Ph.D. thesis, a GRASP-like (Greedy Randomized Adaptive Search Procedure) approach was developed to perform biclustering of microarray data. A new local search was developed, composed of three simple steps based on a concept inspired by the social aggregation of groups. It is very fast and yields results similar to those achieved by some of the best-known biclustering algorithms. Other new algorithms are also proposed, using novel combinations of iterated local search and MST clustering. The different biclustering algorithms were then tested on four datasets of gene expression data. The results are encouraging because they are similar to, or even better than, those obtained with the former GRASP-like algorithm. Possible future improvements could be obtained by implementing further combinations of heuristics and testing them on different datasets in order to evaluate their general applicability to different kinds of data.
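
For readers unfamiliar with GRASP, the general scheme named in the abstract can be sketched as follows. This is only an illustration under stated assumptions, not the thesis's algorithm: the abstract does not give the objective function or the three-step local search, so the sketch assumes the mean squared residue of Cheng and Church as the cost, a fixed column set and row count, and a plain row-swap local search. All function names and parameters (`msr`, `construct`, `local_search`, `grasp`, `alpha`) are hypothetical.

```python
import random

def msr(data, rows, cols):
    """Mean squared residue of the submatrix rows x cols (assumed cost)."""
    n, m = len(rows), len(cols)
    row_mean = {i: sum(data[i][j] for j in cols) / m for i in rows}
    col_mean = {j: sum(data[i][j] for i in rows) / n for j in cols}
    all_mean = sum(row_mean.values()) / n
    return sum((data[i][j] - row_mean[i] - col_mean[j] + all_mean) ** 2
               for i in rows for j in cols) / (n * m)

def construct(data, cols, n_rows, alpha, rng):
    """Greedy randomized construction: grow the row set one row at a time."""
    rows = [rng.randrange(len(data))]
    while len(rows) < n_rows:
        candidates = [i for i in range(len(data)) if i not in rows]
        scored = sorted((msr(data, rows + [i], cols), i) for i in candidates)
        lo, hi = scored[0][0], scored[-1][0]
        # Restricted candidate list: rows whose cost is close to the best one.
        rcl = [i for cost, i in scored if cost <= lo + alpha * (hi - lo)]
        rows.append(rng.choice(rcl))
    return rows

def local_search(data, rows, cols):
    """Swap one row for an outside row whenever that lowers the cost."""
    rows = list(rows)
    improved = True
    while improved:
        improved = False
        best = msr(data, rows, cols)
        for pos in range(len(rows)):
            for i in range(len(data)):
                if i in rows:
                    continue
                trial = rows[:pos] + [i] + rows[pos + 1:]
                cost = msr(data, trial, cols)
                if cost < best - 1e-12:  # tolerance avoids float-noise cycling
                    rows, best, improved = trial, cost, True
    return rows

def grasp(data, cols, n_rows, iterations=20, alpha=0.3, seed=0):
    """Repeat construction + local search and keep the best row set found."""
    rng = random.Random(seed)
    best_rows, best_cost = None, float("inf")
    for _ in range(iterations):
        start = construct(data, cols, n_rows, alpha, rng)
        rows = local_search(data, start, cols)
        cost = msr(data, rows, cols)
        if cost < best_cost:
            best_rows, best_cost = sorted(rows), cost
    return best_rows, best_cost
```

On a toy matrix such as `[[1, 2, 3, 4], [2, 3, 4, 5], [9, 1, 7, 2], [5, 6, 7, 8], [0, 8, 1, 9]]`, calling `grasp(data, cols=[0, 1, 2, 3], n_rows=3)` should favour the three rows that differ only by a constant shift, since their mean squared residue is zero. A real biclustering run would also search over columns and bicluster size, which this sketch leaves out for brevity.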
Files in this item:
File: tesi_PhD_Musacchia.pdf
Access: only from BNCF and BNCR
Type: Other attached material
License: All rights reserved
Size: 3.41 MB
Format: Adobe PDF

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/329301
The NBN code of this thesis is URN:NBN:IT:BNCF-329301