
SEMANTIC-ORIENTED NATURAL LANGUAGE PROCESSING FOR SUSTAINABILITY: STRUCTURED SENTIMENT ANALYSIS, FRAME ANALYSIS, AND MULTI-ATTRIBUTE STYLES TRANSFER

IBROHIM, MUHAMMAD OKKY
2025

Abstract

Discussion of environmental sustainability has intensified in recent years, as these issues are vital for people and must be addressed by citizens and governments together. Governments in particular need efficient, data-driven analysis to address specific environmental sustainability issues in their country or region. Previous work has applied sentiment analysis to these topics to detect whether people are discussing an environmental issue, and frame analysis to examine how people frame such issues, since different frames may affect how people act and react to them. However, NLP applications for sentiment and frame analysis in environmental sustainability topics are relatively new and lag far behind those for general topics. There is currently no open dataset for structured sentiment analysis in environmental sustainability topics, even though this fine-grained scheme is crucial for obtaining meaningful sentiment information. Without such a dataset, fine-grained sentiment analysis on this topic is challenging and calls for a method that can perform the task with limited training data. For frame analysis, we found that previous works follow various frame paradigms that rely heavily on manual annotation and analysis, which makes them difficult to replicate. Yet this task is crucial for governments: by understanding the dominant narratives about this topic, policymakers can tailor their communication to promote more effective environmental policies. Lastly, among tasks related to frame analysis, we found no previous work on style transfer that considers the frame semantics used by politicians, although this topic is important because how politicians frame environmental sustainability issues may influence the discourse of both the public and scientists. As there is no previous work, we need a method to perform style transfer for this topic on a non-parallel dataset.
In this thesis, we address several of the issues mentioned above. First, we conduct systematic reviews to explore in depth the current trends and biggest challenges in sentiment and frame analysis applied to environmental sustainability topics. By proposing a new framework for automatic span-level annotation aggregation and evaluation, we present EnviS, a new multilingual dataset for structured sentiment analysis in environmental sustainability topics. We then benchmark EnviS using LLMs and encoder-based dependency graph parser models. To improve LLM performance on sentiment analysis, we introduce BacKGen, a framework for generating background knowledge as an alternative to classical few-shot prompting, which provides more comprehensive information for the prompt. Experimental results on sentiment expression classification show that background knowledge injection derived from BacKGen outperforms classical few-shot prompting, with a significant error reduction relative to the zero-shot setting. For frame analysis, we present ParlED, a multilingual corpus for frame analysis of European parliamentary debates on environmental sustainability topics. Following a frame semantic paradigm with the FFICF score as the metric for detecting typical frames, we show that this approach can efficiently analyze, without manual intervention, the frames used by European politicians when communicating environmental sustainability issues along multiple dimensions, including time, topic, and political ideology. Lastly, we introduce a new task, namely political ideology and frame semantic style transfer in environmental sustainability topics. To perform this task, we introduce a new method for selecting examples for few-shot prompting on a non-parallel dataset; experimental results show that this method generally improves performance over the zero-shot baseline.
26 June 2025
English
BOSCO, Cristina
BASILE, Valerio
Università degli Studi di Torino
Files in this item:
File Size Format
FinalThesis_MuhammadOkkyIbrohim.pdf

open access

Size 1.91 MB
Format Adobe PDF

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/214161
The NBN code of this thesis is URN:NBN:IT:UNITO-214161