Statistical modeling and simulation for differential expression analysis of sequenced databases

Polisety, Aneesha

The development of single-cell RNA sequencing (scRNA-seq) has made it feasible to quantify the dynamics of gene expression at the single-cell level. A single scRNA-seq experiment can produce expression data for thousands of genes across up to millions of cells. Differential expression analysis is the main downstream analysis of such data: it identifies marker genes for cell type detection and supplies inputs for secondary investigations. A wide range of statistical techniques for differential expression analysis has been reported in the literature. This dissertation presents new statistical methods for differential expression analysis and proposes algorithms that improve the computational efficiency of resampling-based tests in genomic studies. The main contribution is three approaches to optimizing and fitting complex statistical models, which can be applied to genomic data and to other types of data. A gene-level differential score is introduced for jointly testing differential expression and differential splicing of genes, which improves statistical power. A two-part mixed model is applied to single-cell gene expression data and fitted efficiently using automatic differentiation. The work also proposes a user-friendly software package for differential expression analysis of single-cell gene expression data, and an adaptive cross-entropy (CE) method for evaluating small p-values from permutation tests.

Course of study	Industrial Engineering
Publication date	2023
Language	English
Supervisor, Advisor, or Tutor	STADERINI, ENRICO MARIA
Publisher	Università degli Studi di Roma "Tor Vergata"
Collection	Università degli Studi di Roma Tor Vergata

Files in this item:

File	Size	Format
Aneesha Polisety_7.pdf (access only from BNCF and BNCR)	4.61 MB	Adobe PDF

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/218794

The NBN code of this thesis is URN:NBN:IT:UNIROMA2-218794