The peculiar structure of the olive (Olea europaea L.) genome as shown by massively parallel sequencing data.

2014

Abstract

The olive tree (Olea europaea L.) is poorly characterized at the genetic and genomic level compared to other fruit tree crops. Within the framework of the Italian project OLEA, which aims to obtain the complete sequence of the olive genome, we performed an in-depth analysis of the repetitive component of this genome using NGS techniques (Roche 454 and Illumina). In a first study, we described different computational procedures for isolating and characterizing olive repeated sequences. These procedures were used to determine the structure of the genome and the composition of its repetitive fraction. Our analyses revealed the peculiar landscape of the olive genome, which results from the massive amplification of tandem repeats; these represent about 31% of the whole genome, more than reported for any other sequenced plant genome. Tandem repeats are represented by six main families of different lengths, two of which were first discovered in these experiments. The other large redundant class in the olive genome consists of transposable elements, especially LTR-retrotransposons (LTR-REs). Similar procedures were concurrently applied to the genome of another species, the sunflower, and an article on this species is included as Appendix 2. In a second study, we characterized 255 unique full-length LTR-REs, identified by scanning a number of BAC clone sequences. Copia elements were more numerous than Gypsy ones (162 vs. 81); 12 elements could not be assigned to either superfamily because they lacked distinctive domains. Mapping a large set of Illumina reads onto the LTR-REs revealed that Gypsy families comprise more members than Copia ones. Four RE families were composed mainly of solo-LTRs. Insertion times of intact retroelements, estimated from the divergence between sister LTRs, showed that the mean insertion age of the isolated REs is around 18 million years (MY), although some elements inserted relatively recently. Gypsy and Copia REs showed different waves of transposition, with Gypsy elements especially active between 10 and 25 MY ago and nearly inactive in the last 7 MY. In the third study, using a dedicated bioinformatic pipeline on olive BAC clone sequences, we identified 418 olive Short Interspersed Nuclear Elements (SINEs), which constitute one of the first SINE collections in a dicotyledonous species. The identified SINEs represent 0.48% of the olive genome, and their length ranges from 62 to 588 bp. The vast majority of the identified SINEs showed low or medium redundancy and were often associated with genic sequences. Sequence similarity analysis identified ten major families. Our results demonstrate the suitability of the pipeline for SINE identification and will facilitate further analyses of these relatively unknown elements in other plant species.
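
The insertion ages reported in the abstract come from the divergence between the two LTRs of the same element, which are identical at the moment of insertion and then accumulate substitutions independently. The following is a minimal Python sketch of one common way to make such an estimate, assuming the two LTRs are already aligned: it computes the Kimura two-parameter distance K and applies T = K / (2r). The abstract does not state which distance measure or substitution rate was used, so the function names and the rate value are illustrative assumptions, not the parameters of the thesis.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(ltr_5prime: str, ltr_3prime: str) -> float:
    """Kimura two-parameter distance between two aligned sister LTRs.

    Columns containing gaps or ambiguous bases are skipped.
    """
    if len(ltr_5prime) != len(ltr_3prime):
        raise ValueError("sequences must be aligned to the same length")

    compared = transitions = transversions = 0
    for a, b in zip(ltr_5prime.upper(), ltr_3prime.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # gap or ambiguous base: not informative
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine

    if compared == 0:
        raise ValueError("no comparable sites in the alignment")

    p = transitions / compared
    q = transversions / compared
    arg_a = 1.0 - 2.0 * p - q
    arg_b = 1.0 - 2.0 * q
    if arg_a <= 0.0 or arg_b <= 0.0:
        raise ValueError("sequences too divergent for the K2P correction")
    # K = -1/2 * ln((1 - 2P - Q) * sqrt(1 - 2Q))
    return -0.5 * math.log(arg_a * math.sqrt(arg_b))


def insertion_time_my(distance: float, rate_per_site_per_year: float) -> float:
    """Insertion age in million years: T = K / (2 * r)."""
    if rate_per_site_per_year <= 0.0:
        raise ValueError("substitution rate must be positive")
    return distance / (2.0 * rate_per_site_per_year) / 1e6


if __name__ == "__main__":
    # Toy aligned LTR pair; real LTRs are hundreds to thousands of bp long.
    left = "ACGTACGTACGTACGTACGT"
    right = "ACGTACGTACATACGTACGT"
    ASSUMED_RATE = 1e-9  # hypothetical placeholder, substitutions/site/year
    k = k2p_distance(left, right)
    print(f"K = {k:.4f}, T = {insertion_time_my(k, ASSUMED_RATE):.2f} MY")
```

The factor of 2 in the denominator reflects that both LTRs diverge from the same ancestral sequence, so the observed distance accumulates along two lineages. The resulting ages scale inversely with the assumed rate, which is why the rate should be taken from the source study and not from this sketch.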
7 March 2014
Italian
Natali, Lucia
Zuccolo, Andrea
Pistelli, Laura
Martini, Claudia
Lupo, Giuseppe
Università degli Studi di Pisa
Files in this item:
tesi_elena_completa.pdf (open access)
Type: Other attached material
Size: 2.71 MB
Format: Adobe PDF

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/137428
The NBN code of this thesis is URN:NBN:IT:UNIPI-137428