NANOPORE SEQUENCING AS A NEW TOOL TO EXPLORE COMPLEX TRANSCRIPTOMES
UGOLINI, CAMILLA
2025
Abstract
Long-read sequencing has revolutionized transcriptomics by helping to resolve isoform structures, unannotated splicing variants, complex loci and repetitive regions, and by emerging as a method for detecting RNA modifications. This versatility allows it to be applied to a range of biological problems. For example, we leveraged an adaptation of direct-RNA Nanopore sequencing (dRNA-Seq), named NRCeq, to obtain a comprehensive full-length annotation of the SARS-CoV-2 transcriptome. In parallel, we identified putative pseudouridylation sites on multiple subgenomic RNAs (sgRNAs), one of which lies in a well-known regulatory region proposed to play a role in the translation of viral sgRNAs. Furthermore, within the FANTOM6 consortium, we used an adaptation of the cDNA-PCR Nanopore sequencing protocol, named CFC-seq, to select for full-length reads and annotate human non-coding RNAs and eRNAs in neural and monocytic cells. We benchmarked SALA, a custom assembler for CFC-seq data, against assemblers available in the literature, and found that its performance is intermediate between that of algorithms relying heavily on the reference annotation and that of algorithms depending primarily on the input datasets. Finally, by integrating multiple sequencing protocols, we generated a Breast Cancer Transcriptomic Panel to investigate alternative splicing, retrotransposons and RNA modifications on coding and non-coding RNAs in multiple cell lines and organoids. We showed that we can assemble the transcriptomes of breast cancer cell lines, retrieve annotated and unannotated protein-coding RNAs and lncRNAs, detect m6A sites across different transcript biotypes, and identify retrotransposons. Overall, these studies indicate that long-read sequencing is a valuable resource for profiling the transcriptional and epitranscriptional landscape of an organism.

File | Size | Format |
---|---|---|
phd_unimi_R13144.pdf (under embargo until 19/06/2026) | 41.97 MB | Adobe PDF |
Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/20.500.14242/189842
URN:NBN:IT:UNIMI-189842