Retrotransposon expression profiling: Unveiling the hidden SINE transcriptome through Next-Generation Sequencing data analysis

2016

Abstract

Of the ~1.7 million SINE elements in the human genome, only a tiny number are estimated to be actively transcribed by RNA polymerase (Pol) III. Tracing the individual loci from which SINE transcripts originate is complicated by their highly repetitive nature. By exploiting RNA-seq datasets and unique SINE DNA sequences, we devised a bioinformatic pipeline that identifies Pol III-dependent transcripts of individual SINE elements. When applied to ENCODE transcriptomes of seven human cell lines, this search strategy identified ~1300 Alu loci and ~1100 MIR loci corresponding to detectable transcripts, with ~120 Alu loci and ~60 MIR loci expressed in at least three cell lines. In vitro transcription of selected SINEs did not reflect their in vivo expression properties, and required the native 5'-flanking region in addition to the internal promoter. We also identified a cluster of expressed AluYa5-derived transcription units, juxtaposed to snaR genes on chromosome 19, each formed by a promoter-containing left monomer fused to an Alu-unrelated downstream moiety. Autonomous Pol III transcription was also revealed for SINEs nested within Pol II-transcribed genes, raising the possibility that SINE transcriptional units underlie a mechanism of Pol II gene regulation. Moreover, applying our bioinformatic pipeline to RNA-seq data from cells subjected to an in vitro pro-oncogenic stimulus, and from in vivo matched tumor and non-tumor samples, allowed us to detect increased Alu RNA expression as well as the source loci of this deregulation. The ability to investigate SINE transcriptomes at single-locus resolution will facilitate both the identification of novel biologically relevant SINE RNAs and the assessment of altered SINE expression under pathological conditions.
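The "expressed in at least three cell lines" criterion mentioned in the abstract can be illustrated with a minimal sketch. It is not the thesis pipeline: the function name, the input layout (a mapping from cell line to the set of loci detected in it), and the toy locus identifiers are all assumptions made for illustration only.

```python
from collections import Counter


def loci_expressed_in_min_lines(expressed_by_line, min_lines=3):
    """Return the loci detected in at least `min_lines` cell lines.

    `expressed_by_line` maps a cell-line name to an iterable of locus IDs
    (hypothetical input format, not taken from the thesis).
    """
    counts = Counter()
    for loci in expressed_by_line.values():
        counts.update(set(loci))  # count each locus once per cell line
    return sorted(locus for locus, n in counts.items() if n >= min_lines)


# Toy, made-up data: locus IDs and cell-line names are placeholders.
example = {
    "line_A": {"Alu_1", "Alu_2", "MIR_1"},
    "line_B": {"Alu_1", "MIR_1"},
    "line_C": {"Alu_1", "Alu_2"},
    "line_D": {"MIR_1", "Alu_3"},
}
shared = loci_expressed_in_min_lines(example)
print(shared)
```

Converting each per-line collection to a set first ensures that a locus listed twice for the same cell line is still counted only once toward the threshold.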
18 March 2016
English
SINE
Alu
MIR
RNA polymerase III
ncRNA
RNA-seq
bioinformatic
biomarker
Dieci, Giorgio
Università degli Studi di Parma
Files in this item:
File: Tesi.pdf
Access: only from BNCF and BNCR
Type: Other attached material
Size: 17.35 MB
Format: Adobe PDF

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/149865
The NBN code of this thesis is URN:NBN:IT:UNIPR-149865