Transcription Related Genetic Variation in Human Genome

2015

Abstract

Advances in genomic technology in recent years have led to a dramatic reduction in the cost of producing and analysing genomic data. The variety of biological features for which genomic data have been produced offers an unprecedented opportunity to analyse the relationships between these features at a genome-wide level for an ever-increasing number of samples. Specialised bioinformatics approaches are thus needed to handle this enormous amount of data and to investigate biological problems in a focused manner. The aim of this work was to develop techniques for analysing genomic signals related to the transcriptional process in humans, taking advantage of currently available public datasets. A series of statistical and bioinformatics procedures was designed to extract and analyse several aspects of the distribution of genomic signals across a set of genomic loci of interest. These techniques were applied to investigate structural variation around human transcription start sites and human polyadenylation sites, as well as the influence of evolutionary and non-evolutionary forces in shaping this variability. Several genomic structural signals, from single nucleotide polymorphisms to nucleosome occupancy, and several genome-related characteristics (conservation, biased gene conversion and pathogenicity) were analysed. Overall, the results showed that variants tend to be distributed non-randomly around these two classes of sites, and that the way in which variants are distributed appears to be related to their frequency. Moreover, comparison of these distributions with other genomic signals suggests that sequence composition, chromatin structure and natural selection play an active role in the emergence and maintenance of such diversity. The results presented here need to be extended and experimentally validated, but they represent a step forward in our understanding of how genetic variability arises and how it is maintained in such critical DNA regions.
Files in this record:
File: scala_giovanni_27.pdf
Access: restricted to BNCF and BNCR
Type: Other attached material
License: All rights reserved
Size: 9.29 MB
Format: Adobe PDF

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/317354
The NBN code of this thesis is URN:NBN:IT:BNCF-317354