PLANT "OMICS": ON THE IMPORTANCE OF SUITABLE RESOURCES

2015

Abstract

In the "omics" era, bioinformatics plays a crucial role in developing new strategies for tackling diverse problems and for better exploiting the different aspects of biology. Moreover, with the advent of Next Generation Sequencing (NGS), the amount of data produced has increased exponentially, and so has the need to manage the results obtained so that this information can be exploited in new and deeper analyses. However, the resources available for a given species are not always unified, updated or integrated, which creates confusion and data heterogeneity. In this context, we focused on the currently available resources for some plant genomes. In particular, we considered Arabidopsis thaliana, the model organism for plant genomics, and two other species of major interest in crop genomics and in the worldwide economy: Solanum lycopersicum (tomato) and Solanum tuberosum (potato). We surveyed all the relevant genomics resources for these plants to collect the currently available information on genome releases and gene annotation versions. Moreover, we examined the available tomato genome annotations in depth, highlighting the limits that remain, given that the annotation considered is the first gene annotation release for this recently sequenced genome. In the last part of the work, we extended the analysis to transcriptomics data. On the one hand, we investigated Arabidopsis online resources for microarray-based co-expression analysis, comparing the source data, the methods and the results currently achievable. On the other hand, because of the heterogeneity of microarray data for tomato and potato, we preferred to focus on RNA-seq analysis strategies: we set up an appropriate pipeline, tested it in a specific analysis of tomato drought stress, and focused on possible issues arising from a limited annotation such as that of tomato.
Our work highlighted the lack of uniformity among reference plant collections, probably caused by the many different factors at play in a field as multifaceted as the plant sciences. The lack of reliable and uniform references for plants can lead to misinterpretation of biological data, limiting their use by the scientific community, especially in plant comparative genomics.
Files in this item:
tesi_colantuono.pdf (11.92 MB, Adobe PDF)
Access: only from BNCF and BNCR
Type: Other attached material
License: All rights reserved

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/317197
The NBN code of this thesis is URN:NBN:IT:BNCF-317197