Distributional Models of Emotions for Sentiment Analysis in Social Media
2017
Abstract
With the proliferation of social media, textual emotion analysis is becoming increasingly important. Sentiment Analysis and Emotion Detection are useful in several applications. They can be used, for instance, in Customer Relationship Management to track sentiments towards companies and their services, or in Government Intelligence to collect people's emotions and points of view about government decisions. Tracking reputation and opinions without appropriate text mining tools is simply infeasible. Most of these tools are based on sentiment and emotion lexicons, in which lemmas are associated with the sentiments and/or emotions they evoke. However, almost all languages other than English lack high-coverage inventories of this sort. This thesis presents several sentiment analysis tasks to illustrate challenges and opportunities in this research area. We review different state-of-the-art methods for sentiment analysis and emotion detection and describe how we modeled a framework to build emotive resources that can be effectively exploited for text affective computing. One of the main outcomes of the work presented in this thesis is ItEM, a high-coverage Italian EMotive lexicon created by exploiting distributional methods. It has been built with a three-stage process comprising the collection of a set of highly emotive words, their distributional expansion, and the validation of the system. Since corpus-based methods reflect the type of corpus from which they are built, in order to create a reliable lexicon we collected a new Italian corpus, namely FB-NEWS15. This collection was created by crawling the Facebook pages of the most important Italian newspapers, which typically include a small number of posts written by journalists and a very high number of comments arising from long discussions among readers about the news. Finally, we describe some experiments on the sentiment polarity classification of tweets.
We started from a system based on supervised learning that was originally developed for the Evalita 2014 SENTIment POLarity Classification task (Basile et al., 2014) and subsequently explored the possibility of enriching this system by exploiting lexical emotive features derived from social media texts.

File | Access | Type | Size | Format
---|---|---|---|---
PASSARO_Activity_Report.pdf | Open access | Other attached material | 88.26 kB | Adobe PDF
PhD_thesis_DII_Passaro.pdf | Open access | Other attached material | 1.81 MB | Adobe PDF
Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/20.500.14242/130658
URN:NBN:IT:UNIPI-130658