Decoding Innovation: A Text-Based Approach to AI, Sustainability, and Firm Performance

SANTARLASCI, LAPO
2026

Abstract

This dissertation adopts natural language processing (NLP) methods to develop transparent and reproducible measures of green invention, artificial intelligence (AI) capabilities, and firm engagement with the Sustainable Development Goals (SDGs). The second chapter introduces a text-based methodology to identify green patents using word embeddings trained on over 12 million patent abstracts. The analysis shows that only about 18.5% of patents labeled as “green” in existing classifications reflect genuine environmental content, while more than 1 million environmentally relevant patents are missed. Linking these refined measures to European firm-level data reveals that authentic green patenting is associated with higher sales, productivity, and market share. The third chapter maps global patented AI innovation from 2010 to 2023. It documents a sharply concentrated system dominated by the U.S. and China and highlights the central role of large multinationals. Using technological proximity indicators, citation survival analysis, and a gravity model of knowledge flows, the chapter offers new evidence on how AI capabilities diffuse and assesses Europe’s mixed position in the global AI landscape. The fourth chapter develops an NLP-based measure of SDG engagement using website text from more than 10,000 Italian firms (2018–2023). The results reveal heterogeneous patterns of sustainability communication and show that economic returns vary markedly across SDGs. Across all chapters, the dissertation demonstrates the value of computational text analysis for improving measurement in innovation and corporate strategy research and provides new empirical insights relevant to industrial policy and firm-level decision-making.
27 April 2026
English
RUNGI, ARMANDO
Scuola IMT Alti Studi di Lucca
Lucca, Italy
231
Files in this item:
File Size Format
Thesis_Lapo_Santarlasci_final.pdf

under embargo until 30 April 2029

License: All rights reserved
Size 2.72 MB
Format Adobe PDF

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/368587
The NBN code of this thesis is URN:NBN:IT:IMTLUCCA-368587