Bayesian methods for complex dependence structures with application to ecology (original title: Metodi Bayesiani per strutture di dipendenza complesse con applicazione all'ecologia)

Stolf, Federica

Recent advances in species sampling methods and bioinformatic tools have led to the collection of increasingly complex species occurrence datasets, motivating the development of statistical tools that extract insights from these data and improve our understanding of biodiversity. There is a rich literature in ecology on so-called joint species distribution models, which usually take the form of multivariate probit latent factor regression models. However, these models have fundamental problems in handling high-dimensional species co-occurrence data with many rare species: (i) they cannot accommodate the new species that are regularly discovered as sampling proceeds; (ii) they provide no specific models for array data and instead flatten the data into a matrix, losing structural information. Motivated by ecological applications, this thesis introduces novel Bayesian methods for modeling multivariate binary data with a growing number of outcomes, and for modeling multiway data. The thesis is organized into two main threads. The first develops a new class of dependent infinite latent feature models, proposing a general framework that bridges multivariate probit models and the Indian buffet process, the most popular method in the infinite latent feature model literature. The second addresses array data modeling by introducing a Bayesian tensor decomposition model that adaptively selects the unknown rank of the decomposition through a suitable shrinkage prior. In both threads, the theoretical properties of the proposed methods are extensively studied, and efficient algorithms for posterior computation are discussed. The performance of the proposed approaches is assessed in simulation studies and complex ecological applications.

Degree programme: SCIENZE STATISTICHE (Statistical Sciences)
Publication date: 21 January 2025
Language: English
Supervisor: CANALE, ANTONIO
Publisher: Università degli studi di Padova
Collection: Università degli Studi di Padova

Files in this item:

File	Access	Size	Format
PhDThesisStolf.pdf	Open access	4.44 MB	Adobe PDF

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/218135

The NBN code of this thesis is URN:NBN:IT:UNIPD-218135