Distributional Models of Emotions for Sentiment Analysis in Social Media

2017

Abstract

With the proliferation of social media, textual emotion analysis is becoming increasingly important. Sentiment Analysis and Emotion Detection are useful in several applications. They can be used, for instance, in Customer Relationship Management to track sentiment towards companies and their services, or in Government Intelligence to collect people's emotions and points of view about government decisions. Tracking reputation and opinions without appropriate text mining tools is simply infeasible. Most of these tools are based on sentiment and emotion lexicons, in which lemmas are associated with the sentiment and/or emotions they evoke. However, almost all languages other than English lack high-coverage inventories of this sort. This thesis presents several sentiment analysis tasks to illustrate challenges and opportunities in this research area. We review state-of-the-art methods for sentiment analysis and emotion detection and describe a framework for building emotive resources that can be effectively exploited for affective computing on text. One of the main outcomes of this work is ItEM, a high-coverage Italian EMotive lexicon created by exploiting distributional methods. It was built with a three-stage process: the collection of a set of highly emotive words, their distributional expansion, and the validation of the system. Since corpus-based methods reflect the type of corpus from which they are built, we collected a new Italian corpus, FB-NEWS15, in order to create a reliable lexicon. This collection was created by crawling the Facebook pages of the most important Italian newspapers, which typically include a small number of posts written by journalists and a very high number of comments arising from long discussions among readers about the news. Finally, we describe some experiments on the sentiment polarity classification of tweets. We started from a system based on supervised learning that was originally developed for the Evalita 2014 SENTIment POLarity Classification task (Basile et al., 2014) and subsequently explored the possibility of enriching this system with lexical emotive features derived from social media texts.
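
As a rough illustration of what "distributional expansion" of seed words can look like, the sketch below scores every word by its cosine similarity to the centroid of each emotion's seed vectors. This is only one plausible reading of the idea, not the procedure used in the thesis: the abstract does not give the similarity measure or the vector representation, and the function names, toy vectors, and seed lists here are all hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def centroid(vectors):
    """Component-wise mean of a list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def expand_lexicon(word_vectors, seeds):
    """Assign each word one score per emotion: similarity to that emotion's seed centroid."""
    centroids = {
        emotion: centroid([word_vectors[w] for w in words if w in word_vectors])
        for emotion, words in seeds.items()
    }
    return {
        word: {emotion: cosine(vec, c) for emotion, c in centroids.items()}
        for word, vec in word_vectors.items()
    }

# Hypothetical toy data: in practice the vectors would be estimated from a corpus.
toy_vectors = {
    "gioia":   [0.9, 0.1, 0.0],
    "felice":  [0.8, 0.2, 0.1],
    "paura":   [0.1, 0.9, 0.1],
    "terrore": [0.0, 0.8, 0.2],
    "festa":   [0.7, 0.1, 0.3],  # not a seed; scored by expansion
}
toy_seeds = {"joy": ["gioia", "felice"], "fear": ["paura", "terrore"]}

lexicon = expand_lexicon(toy_vectors, toy_seeds)
```

With these toy values, the non-seed word "festa" receives a higher score for "joy" than for "fear", which is the kind of graded word-emotion association an emotive lexicon stores.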
16 January 2017
Italian
Lenci, Alessandro
Marcelloni, Francesco
Vaglini, Gigliola
Università degli Studi di Pisa
Files in this item:

PASSARO_Activity_Report.pdf
Access: open access
Type: Other attached material
Size: 88.26 kB
Format: Adobe PDF

PhD_thesis_DII_Passaro.pdf
Access: open access
Type: Other attached material
Size: 1.81 MB
Format: Adobe PDF

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/130658
The NBN code of this thesis is URN:NBN:IT:UNIPI-130658