In today’s rapidly evolving labor market, the value of skill taxonomies such as the European Skills, Competences, Qualifications and Occupations (ESCO) and the Occupational Information Network (O*NET) is increasingly challenged by their static nature. While these resources provide a crucial backbone for Labour Market Intelligence (LMI), they struggle to keep pace with the emergence of new skills driven by digitalization and the green transition. To bridge this gap, Online Job Advertisements (OJAs) have emerged as an invaluable, real-time source of information on the demand side of the labor market. This frames the central question of this doctoral thesis: how can we systematically leverage the dynamic language of OJAs to keep official skill taxonomies continuously updated? The research presented here, developed in close connection with institutional needs, offers a series of methodological contributions that combine advances in natural language processing with practical implementations for LMI. The thesis is structured in three main parts. The first provides the conceptual and empirical background. The second presents the core methodological contributions as a progression of enrichment pipelines: first, a foundational data-driven pipeline for ESCO’s green taxonomy; then the TAXMAP framework, a more sophisticated Artificial Intelligence (AI) approach for digital skills; and finally the SkiLLens pipeline, which extends these methods into a large-scale, multilingual system for detecting emerging skills across Europe. The third and last part focuses on the impact of these enriched taxonomies, using a case study on green skills to show how they enable more dynamic economic and social analyses, providing policymakers with timely intelligence on skill transitions, education requirements, and wage differentials.
Overall, the thesis argues that enriched and adaptive taxonomies are central infrastructure for the future of LMI, bridging methodological innovation with pressing policy needs. By connecting the fields of artificial intelligence and economic analysis, this work highlights the transformative potential of data-driven taxonomies and underscores that keeping them continuously updated is essential for advancing evidence-based research on the dynamics of work and skills.
In today's labor market, the effectiveness of skill taxonomies such as the European Skills, Competences, Qualifications and Occupations classification (ESCO) and the Occupational Information Network (O*NET) is threatened by their static nature. Although these resources are a fundamental pillar of Labour Market Intelligence (LMI), they struggle to keep pace with the emergence of new occupations and skills driven by digitalization and the green transition. To close this gap, Online Job Advertisements (Online Job Ads, OJAs) have established themselves as a valuable real-time source of information on the demand side. This is the context for the central question guiding this doctoral thesis: how can the dynamism and informational richness of online job advertisements be used to keep official taxonomies constantly updated, relevant, and useful for labor market analysis? The research presented here, developed in response to concrete institutional needs, offers a series of methodological contributions that combine advances in natural language processing (NLP) with practical applications for LMI. The thesis is organized in three parts. The first outlines the conceptual and methodological framework, examining word embedding techniques and the main approaches to automatic taxonomy enrichment in the context of LMI.
The second and central part of the thesis presents the methodological contributions as a progression of three enrichment pipelines: a first pipeline for ESCO's "green" taxonomy, built on the FastText word embedding model; the TAXMAP framework, a more sophisticated approach that uses Artificial Intelligence (AI) and Large Language Models to enrich the taxonomy of digital skills; and finally SkiLLens, a system that extends these methods to a large-scale multilingual setting for identifying emerging skills in Europe. The third and last part examines the impact of these updated tools. Through a case study on "green" skills, it shows how enriched taxonomies enable more dynamic economic and social analyses, providing policymakers with timely information on skill transitions, training requirements, and wage differentials. Overall, this work argues that enriched and dynamic taxonomies are crucial infrastructure for the future of LMI, able to connect methodological innovation with urgent policy needs. By linking the fields of artificial intelligence and economic analysis, the thesis highlights the transformative potential of data-driven taxonomies and stresses that keeping them constantly updated is an essential prerequisite for advancing evidence-based research on the dynamics of work and skills.
Novel approaches of Taxonomy enrichment via distributional semantics
DE SANTO, ALESSIA
2026
| File | Size | Format | |
|---|---|---|---|
| phd_unimib_887637.pdf (open access; license: all rights reserved) | 5.09 MB | Adobe PDF | |
Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/20.500.14242/361310
URN:NBN:IT:UNIMIB-361310