Uncertainty: Natural Language Processing and Dynamic Factor Models to understand and forecast Uncertainty dynamics

Morreale, Antonio Pietro Maria

This doctoral thesis studies the measurement, propagation, and macro-financial effects of economic uncertainty through three interconnected applications that combine natural language processing, high-dimensional econometrics, and dynamic factor modelling. Although the three chapters address distinct empirical questions, they are unified by a common methodological concern: how to extract economically meaningful signals from large, noisy, and heterogeneous datasets.

The broader motivation for the thesis lies in the growing transformation of empirical macroeconomics and finance into increasingly data-rich disciplines. Economic information is now dispersed across traditional macroeconomic indicators, financial market variables, textual communication, survey expectations, and high-frequency data. Yet many economically relevant objects (uncertainty, expectations, financial stress, or communication effects) are not directly observable. They must instead be inferred from imperfect proxies and latent structures embedded in large information sets. This creates a central empirical challenge: how to recover informative low-dimensional signals from complex and high-dimensional environments without losing economic interpretability.

Natural language processing and dynamic factor models provide two complementary approaches to this problem. Text-based methods allow researchers to quantify the informational content of language, transforming speeches, news articles, policy statements, and other forms of communication into measurable economic variables. Dynamic factor models, by contrast, provide a statistical framework for summarising co-movements across large panels of variables through a small number of latent common components. While these approaches have often developed separately in the literature, they share a common underlying logic: both seek to recover latent informational structures from high-dimensional data.

This thesis argues that the interaction between these methodologies is particularly useful for the study of uncertainty and financial dynamics. Textual indicators provide rich forward-looking information but often suffer from noise, instability, and context dependence. Factor-based approaches help organise that information within a coherent econometric structure capable of separating common movements from idiosyncratic fluctuations and of studying how informational shocks propagate through the macroeconomy and financial markets.

MAIN CONTRIBUTIONS OF THE THESIS

The thesis makes three main contributions.

First, it provides an integrated discussion of the literature on uncertainty measurement, natural language processing, and dynamic factor models. The survey chapter organises these strands of research around a common informational perspective, emphasising how different approaches (text-based indicators, econometric uncertainty measures, market-based proxies, and factor-model decompositions) address related empirical problems despite relying on distinct methodologies. Particular attention is devoted to the growing interaction between textual analysis and high-dimensional econometric modelling.

Second, the thesis studies the role of dynamic factor structures in financial volatility forecasting and connectedness analysis. Using the Generalised Dynamic Factor Model within the Euro STOXX 50 universe, the empirical analysis examines whether large cross-sectional information sets improve volatility prediction relative to standard benchmark models. The chapter also investigates the network structure of factor-adjusted idiosyncratic volatility to distinguish common systemic dynamics from asset-specific spillover mechanisms.

Third, the thesis analyses the macroeconomic and international effects of Federal Reserve communication. Sentiment extracted from Federal Reserve speeches using lexicon-based and transformer-based NLP methods is embedded in a structural dynamic factor framework to identify communication-related macro-financial shocks and study their transmission to expectations, uncertainty, real activity, and international spillovers. Rather than treating central bank communication as a purely qualitative object, the chapter approaches it as a measurable informational channel operating within a high-dimensional macro-financial system.

The first chapter, therefore, provides the conceptual and methodological foundation of the thesis. It reviews the main approaches used to measure economic uncertainty, surveys modern NLP methods in economics and finance, and discusses the econometric logic of dynamic factor models and structural identification in large information sets. The chapter places particular emphasis on the links between textual indicators and factor-based frameworks, highlighting both their complementarities and their unresolved methodological challenges.

The second chapter moves from measurement to modelling. It applies the Generalised Dynamic Factor Model to realised volatility measures for Euro STOXX 50 firms, comparing its forecasting performance with standard alternatives and studying the evolution of factor-adjusted network connectedness. The analysis is motivated by the idea that financial volatility is partly driven by pervasive common forces that cannot be adequately captured by purely univariate approaches.

The third chapter studies Federal Reserve communication and its macroeconomic consequences. The analysis combines textual sentiment extraction with structural dynamic factor methods and panel local projections to investigate how communication-related shocks propagate through domestic and international macro-financial variables. The chapter also examines differences between advanced and emerging economies, emphasising the role of global financial linkages in the international transmission of uncertainty and policy communication.

Taken together, the three chapters approach uncertainty, volatility, and communication as informational phenomena whose empirical analysis requires methods capable of handling dimensionality, heterogeneity, and latent dependence structures. The objective of the thesis is not to claim that a single framework can fully resolve the problems of measurement or identification that arise in this literature. Rather, the thesis aims to show that combining textual analysis with high-dimensional econometric methods can provide a richer and more disciplined understanding of how information is generated, transmitted, and absorbed in modern economic and financial systems.

2026



Publication date: 2 July 2026
Language: English
Supervisor, Advisor or Tutor: Soccorsi, Stefano; MUGGEO, Vito Michele Rosario
Co-supervisor, Examiner, Co-Tutor or Coordinators: MUGGEO, Vito Michele Rosario
Publisher: Università degli Studi di Palermo
Publisher city: Palermo
Number of pages: 147
Collection: Università degli Studi di Palermo

Files in this item:

File	Size	Format
UncertaintyNaturalLanguageProcessingandDynamicFactorModelstounderstandandforecastUncertaintydynamics.pdf (open access; licence: all rights reserved)	7.26 MB	Adobe PDF

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/373326

The NBN code of this thesis is URN:NBN:IT:UNIPA-373326