PLANT "OMICS": ON THE IMPORTANCE OF SUITABLE RESOURCES
2015
Abstract
In the "omics" era, bioinformatics plays a crucial role in developing suitable new strategies to address different kinds of problems and to better exploit the different aspects of biology. Moreover, with the advent of Next Generation Sequencing (NGS), the amount of data produced has increased exponentially, as has the need to manage the results obtained so that this information can be exploited for new and deeper analyses. However, the available resources for a given species are not always unified, updated or integrated, which creates confusion and data heterogeneity. In this context, we focused on the currently available resources for some plant genomes. In particular, we considered Arabidopsis thaliana, the model organism for plant genomics, and two other species of considerable interest in crop genomics and in the worldwide economy: Solanum lycopersicum (tomato) and Solanum tuberosum (potato). We surveyed all the relevant genomics resources for these plants to collect the currently available information on genome releases and gene annotation versions. Moreover, we examined the available tomato genome annotations in depth, highlighting the limits that remain, since the annotation considered is the first gene annotation release for this recently sequenced genome. In the last part of the work, we extended the analysis to transcriptomics data. On the one hand, we investigated Arabidopsis online resources for microarray-based co-expression analysis, comparing the source data, the methods and the results currently achievable. On the other hand, because of the heterogeneity of microarray data for tomato and potato, we preferred to focus on RNA-seq analysis strategies: we set up an appropriate pipeline, tested it in a specific analysis of tomato drought stress, and focused on possible issues arising from a limited annotation such as the tomato one.
Our work highlighted the lack of uniformity among reference plant collections, probably caused by the many different aspects of a multifaceted field such as plant sciences. Nevertheless, the lack of reliable and uniform references for plants can lead to misinterpretation of biological data, limiting their use by the scientific community, especially in plant comparative genomics.

| File | Size | Format |
|---|---|---|
| tesi_colantuono.pdf (access only from BNCF and BNCR; type: other attached material; license: all rights reserved) | 11.92 MB | Adobe PDF |
Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/20.500.14242/317197
URN:NBN:IT:BNCF-317197