Computational approaches for the identification of APOBEC1-dependent RNA editing events in human tissues

2020

Abstract

The AID/APOBECs are cytosine deaminases involved in diverse physiological contexts through their ability to edit DNA and RNA. This ability comes at a price: Activation-Induced Cytidine Deaminase (AID), the main player in antibody diversification, is responsible for some of the common genetic alterations in mature B-cell tumours; the APOBEC3 subgroup, whose members are important actors in antiviral defence, has been linked to different mutational processes in a number of cancers. APOBEC1, an RNA/DNA editing enzyme that can also restrict lentiviruses and mobile elements, can likewise act as a mutator in human cells, and its aberrant expression could be linked to the onset of alterations at both the genomic and the transcriptomic level. APOBEC1 RNA editing is a post-transcriptional process. Its only well-characterised target is the Apolipoprotein B transcript (ApoB) in the small intestine, where editing of C6666 creates a stop codon and leads to translation of a truncated protein. The situation is quite different in rodents, where the availability of APOBEC1 knockout mice has allowed the discovery of hundreds of APOBEC1-dependent editing events beyond ApoB. To date, no APOBEC1 deficiencies are known in humans, and only a few transcripts have been added to the list of targets. Despite the targets known in mice and in humans, we are far from understanding the overall physiological meaning of APOBEC1-induced C to U editing.
Apart from the efforts made to identify and characterise C to U editing in rodents, only a few computational approaches have been used to identify and characterise human APOBEC1 targets. For this reason, I used available human RNA-seq data to develop a computational strategy for the identification of APOBEC1-dependent RNA editing events. I used The Cancer Genome Atlas (TCGA) and the Genotype-Tissue Expression project (GTEx) to obtain datasets of samples in which APOBEC1 is expressed at different levels. Using these datasets, I divided the samples into groups with high and low APOBEC1 expression and, through known tools and ad hoc scripts, built different pipelines to identify positions in the transcripts that are differentially edited. The pipelines include several filters, among them the removal of mapping artefacts, of germline and somatic single-nucleotide variants, and of homopolymeric regions.
Among the strategies I used, the most promising are those applied to the GTEx small intestine data, where a strict analysis revealed at least 12 sites, including 3 known targets on the ApoB mRNA. Surprisingly, we found evidence of ApoB editing at canonical sites beyond the small intestine, even in the absence of measurable APOBEC1 expression. Considering the possible presence of APOBEC1 outside the gastric tissue, and to improve our capacity to identify C to U editing in human tissues, I created a database of C>U edited sites using RNA-seq data from APOBEC1-transfected HEK293T cell lines. Although this database does not represent physiologically edited sites, it reports the positions that are biologically editable. Intersecting these positions with those obtained from the GTEx dataset identifies hundreds of edited sites in common. Finally, I tested the hypothesis that APOBEC1 editing affects transcript stability in HEK293T cells. Preliminary data suggest that APOBEC1 expression could shift the equilibrium between processed and non-processed RNA towards the latter.
The second part of the thesis centres on the study of RNA off-targets induced by base editors (BEs). To improve the safety of this powerful genome editing tool, another PhD student in the lab, Francesco Donati, selected several APOBEC1 mutants that are unable to edit RNA while maintaining their mutagenicity on DNA. He investigated both the tumorigenicity of these mutants in mice and their use in genome editing, obtaining exome data from murine liver tumours and transcriptome data from cells overexpressing the mutant base editors. I performed the bioinformatic analyses to explore the mutational signature induced by APOBEC1 in mice and to assess the off-target effects of these base editors on RNA and DNA. I demonstrated that, unlike wild-type APOBEC1, these mutants allow genome editing in the absence of detectable off-targets.
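The site-calling strategy outlined in the abstract (splitting samples by APOBEC1 expression, filtering candidate positions, and intersecting them with the sites editable in transfected cells) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration, not the thesis pipeline: the median split, the pooled T/(C+T) editing rate, the `min_delta` and `min_run` thresholds, the function names, and the toy coordinates and read counts are all hypothetical.

```python
from statistics import median

def split_by_expression(expression):
    """Split sample IDs into low/high groups around the median (assumed cut-off)."""
    cutoff = median(expression.values())
    high = {s for s, v in expression.items() if v > cutoff}
    return set(expression) - high, high

def pooled_editing_rate(counts, samples):
    """T reads over C+T reads, pooled across the given samples."""
    c = sum(counts[s][0] for s in samples if s in counts)
    t = sum(counts[s][1] for s in samples if s in counts)
    return t / (c + t) if (c + t) else 0.0

def in_homopolymer(context, min_run=5):
    """True if the flanking sequence holds a run of >= min_run identical bases."""
    run = 1
    for prev, cur in zip(context, context[1:]):
        run = run + 1 if cur == prev else 1
        if run >= min_run:
            return True
    return False

def candidate_sites(sites, expression, known_snvs, min_delta=0.1):
    """Keep sites edited more in the high group, after SNV and homopolymer filters."""
    low, high = split_by_expression(expression)
    kept = []
    for site, info in sites.items():
        if site in known_snvs or in_homopolymer(info["context"]):
            continue
        delta = (pooled_editing_rate(info["counts"], high)
                 - pooled_editing_rate(info["counts"], low))
        if delta >= min_delta:
            kept.append(site)
    return sorted(kept)

# Toy data (hypothetical): per-sample APOBEC1 expression and (C, T) read counts.
expression = {"s1": 0.1, "s2": 0.2, "s3": 5.0, "s4": 8.0}
edited = {"s1": (50, 0), "s2": (40, 0), "s3": (30, 20), "s4": (25, 25)}
flat = {"s1": (48, 2), "s2": (49, 1), "s3": (47, 3), "s4": (50, 0)}
sites = {
    ("chr2", 100): {"context": "ACGTCAGTACA", "counts": edited},
    ("chr2", 200): {"context": "ACAAAAAAGTC", "counts": edited},  # homopolymer
    ("chr5", 300): {"context": "GATCCAGTTCA", "counts": edited},  # known SNV
    ("chr7", 400): {"context": "TGCACGTAGCT", "counts": flat},    # no difference
}
known_snvs = {("chr5", 300)}
# Hypothetical set of positions found editable in APOBEC1-transfected cells.
editable_db = {("chr2", 100), ("chr9", 1)}

hits = candidate_sites(sites, expression, known_snvs)
common = sorted(set(hits) & editable_db)
print(hits, common)
```

Pooling reads across each group keeps the sketch short. A real analysis would more plausibly test each site statistically and require minimum coverage per sample, neither of which is attempted here.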
English
Conticello Silvestro
Università degli Studi di Siena

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/131553
The NBN code of this thesis is URN:NBN:IT:UNISI-131553