BIOINFORMATIC TOOLS FOR NEXT GENERATION GENOMICS

Chiara, Matteo

New sequencing strategies have redefined the concept of “high-throughput sequencing”, and many companies, researchers, and recent reviews now use the term “Next-Generation Sequencing” (NGS) instead. These advances have introduced a new era in genomics and bioinformatics. During my years as a PhD student I developed various software tools, algorithms and procedures for the analysis of Next-Generation Sequencing data, as required by distinct biological research projects and collaborations in which our research group was involved. The tools and algorithms are therefore presented in their appropriate biological contexts. Initially I dedicated myself to the development of scripts and pipelines that were used to assemble and annotate the mitochondrial genome of the model plant Vitis vinifera. The sequence was subsequently used as a reference to study the RNA editing of mitochondrial transcripts, using data produced by the Illumina and SOLiD platforms. I then developed a new approach and a new software package for the detection of relatively small indels between a donor and a reference genome, using NGS paired-end (PE) data and machine learning algorithms. I was able to show that, contrary to previous assertions, suitable paired-end data can be used to detect very small indels with high confidence in low-complexity genomic contexts. Finally, I participated in a project aimed at reconstructing the genomic sequences of two distinct strains of the biotechnologically relevant fungus Fusarium. In this context I performed the sequence assembly to obtain the initial contigs, and devised and implemented a new scaffolding algorithm which has proved to be particularly efficient.

2012

Degree programme: SCIENZE GENETICHE E BIOMOLECOLARI
Publication date: 20 April 2012
Language: English
Keywords: bioinformatics; comparative genomics; genome assembly; scaffolding; structural variations
Supervisor/Advisor/Tutor: HORNER, DAVID STEPHEN
Publisher: Università degli Studi di Milano
Collection: Università degli Studi di Milano

Files in this item:

File	Size	Format
phd_unimi_R08171.pdf (Open Access from 29/03/2013; licence: all rights reserved)	4.83 MB	Adobe PDF

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/101822

The NBN code of this thesis is URN:NBN:IT:UNIMI-101822