Advances in mixture models for ordinal data: theoretical insights and model-based clustering
Ventura, Matteo
2025
Abstract
This doctoral thesis focuses on the analysis of ordinal data, a type of data that has received limited attention in the literature and poses several challenges due to its unique characteristics. The first part of the thesis provides a comprehensive overview of the principal models used for analyzing ordinal data, beginning with the Generalized Linear Models framework and extending to more recent specialized distributions, including the CUB and BOS models. The second part presents the research conducted over the past two years, with particular emphasis on the development of applications, theoretical insights, and new models within the CUB class introduced in the first part. This part follows a structured progression, so that the reader can build on each new contribution as it is introduced. The first contribution, presented in Chapter 3, extends the so-called CUM model, a specific approach for analyzing rating data from Semantic Differential Scales. The CUM model was originally proposed for general use and developed specifically for seven-point scales; this thesis introduces a novel adaptation of the model for analyzing data from five-point Semantic Differential Scales. The performance of the adapted model was tested through both simulation studies and applications to real data. The second contribution, presented in Chapter 4, compares the CUB and CUM models in the context of five- and seven-category scales. Specifically, it investigates analytically the conditions under which the two models are equivalent. The third contribution, presented in Chapter 5, applies the CUM model to seven-point Semantic Differential Scales, with a dual aim. First, it demonstrates how the model works in practice and how it can be used to analyze ordinal data. Second, it offers a valuable contribution to both society and the city of Brescia (Italy), as the research was conducted within the “DS4BS: Data Science for Brescia” project, which aimed to analyze visitors’ perceptions of the city’s Art Gallery. The fourth contribution, presented in Chapter 6, was developed during a visiting period at the ERIC Laboratory of the University Lumière Lyon 2 in France. This project introduced a mixture model for analyzing rating data within the CUB framework. Simulation studies were conducted to evaluate the model’s performance, and the model was subsequently applied to real data to demonstrate its practical use.

File | Size | Format
---|---|---
Thesis_VENTURA_pdfA.pdf (open access) | 8.59 MB | Adobe PDF
Documents in UNITESI are protected by copyright, and all rights are reserved unless otherwise indicated.
https://hdl.handle.net/20.500.14242/190465
URN:NBN:IT:UNIBS-190465