Hierarchical Classification in Low-Data Settings

Paletto, Lorenzo

Machine learning has made significant strides in many domains, and classification underpins applications such as sentiment analysis, image recognition, and spam detection. However, deep learning models often require large amounts of labeled data, which can be difficult to obtain due to cost, time, and privacy constraints. This challenge has led to the development of Low-Shot Learning techniques, including Few-Shot and Zero-Shot Learning, which enable models to generalize effectively from minimal labeled examples. This thesis explores hierarchical classification within the constraints of low-shot learning, focusing on both textual and visual data.

In the textual classification domain, we introduce a novel label augmentation technique that leverages Large Language Models (LLMs) to refine and expand existing label taxonomies. Our approach adds a further level of semantically meaningful labels by prompting LLMs with structured examples. To evaluate the effectiveness of this enhancement, we propose a set of metrics that quantify taxonomy granularity. Applied to four publicly available hierarchical datasets, our method improves Zero-Shot Hierarchical Text Classification when combined with the Upward Score Propagation technique, achieving state-of-the-art results on three datasets. The strong correlation between classification performance and our proposed metrics underscores the utility of structured label augmentation in hierarchical classification.

In the field of computer vision, we propose a novel technique to balance hierarchical information and class separation during prototype positioning. This method uses a distance matrix derived from the taxonomy to quantify the severity of misclassifications. Experimental validation confirms improved representation learning in hyperspherical spaces. Additionally, we introduce a contrastive loss function combined with a prototype pruning mechanism, which ensures that representations are repelled only from the most relevant incorrect prototypes rather than from all incorrect ones. Evaluations on three hierarchical datasets demonstrate that this strategy outperforms existing state-of-the-art methods.

Beyond conventional classification tasks, this thesis explores Zero-Shot Learning to model sentiment diffusion in social networks, particularly in online financial communities such as Reddit. By integrating sentiment analysis, agent-based modeling, and epidemiological theory, we demonstrate that missing sentiment labels can be effectively inferred using zero-shot methods. The proposed model successfully captures large-scale sentiment dynamics, reinforcing the analogy between social contagion and disease spread. Our results contribute to understanding sentiment-driven behaviors in financial markets and highlight the synergy between deep learning and theoretical modeling. However, the model assumes stationary conditions, which restricts its applicability in volatile market scenarios. To address this limitation, future research will incorporate stock price movements to capture more nuanced interactions between online discourse and market fluctuations, enhancing the model’s applicability in real-world financial scenarios.

In conclusion, this thesis advances the fields of hierarchical classification, low-shot learning, and sentiment analysis in network environments. Using structured label augmentation, prototype learning, and zero-shot techniques, we provide novel methodologies to improve classification performance under data scarcity constraints. Our findings contribute valuable insights into the interplay between machine learning and structured knowledge representation.
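
As an illustration of the taxonomy-derived distance matrix mentioned in the abstract, the following minimal sketch computes pairwise tree distances between leaf classes of a toy hierarchy. The toy taxonomy, the function names, and the choice of path length through the lowest common ancestor as the distance are all assumptions made for illustration; the abstract does not specify how the thesis defines or computes the matrix.

```python
# Hypothetical toy taxonomy, encoded as child -> parent (the root maps to None).
PARENT = {
    "root": None,
    "animal": "root",
    "vehicle": "root",
    "dog": "animal",
    "cat": "animal",
    "car": "vehicle",
}


def path_to_root(node, parent):
    """Return the nodes from `node` up to the root, inclusive."""
    path = []
    while node is not None:
        path.append(node)
        node = parent[node]
    return path


def tree_distance(a, b, parent):
    """Count the edges between a and b via their lowest common ancestor."""
    path_a = path_to_root(a, parent)
    steps_from_a = {n: i for i, n in enumerate(path_a)}
    for steps_from_b, n in enumerate(path_to_root(b, parent)):
        if n in steps_from_a:
            return steps_from_a[n] + steps_from_b
    raise ValueError("nodes do not share a root")


def distance_matrix(leaves, parent):
    """Build the symmetric matrix of pairwise tree distances between leaves."""
    return [[tree_distance(a, b, parent) for b in leaves] for a in leaves]


LEAVES = ["dog", "cat", "car"]
D = distance_matrix(LEAVES, PARENT)
for name, row in zip(LEAVES, D):
    print(name, row)
```

In this toy example, confusing "dog" with "cat" has distance 2, while confusing "dog" with "car" has distance 4, so the matrix ranks the second mistake as more severe. Any such matrix could then weight errors or guide prototype placement; how the thesis actually uses it is not described in the abstract.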

2025


Degree programme: MODELING AND DATA SCIENCE
Publication date: 30-Jul-2025
Language: English
Supervisor, Advisor or Tutor: ESPOSITO, Roberto
Publisher: Università degli Studi di Torino
Collection: Università degli Studi di Torino

Files in this item:

File	Access	License	Size	Format
tesi_una_colonna.pdf	Open access	All rights reserved	5.31 MB	Adobe PDF

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/219103

The NBN code of this thesis is URN:NBN:IT:UNITO-219103