Social Media Data Analytics for Tourists' Mobility Modeling and Prediction

2018

Abstract

Understanding and predicting human mobility is one of the most interesting and challenging objectives of big data analytics, raising many scientific questions and enabling high-impact applications. In this context, the mobility of visitors within a tourist area (from small cities to whole countries) is a specific yet important case, with its own characteristics and a high economic and social impact. In this thesis, we study methods and algorithms for modeling tourists’ mobility in urban settings, with a twofold objective: better understanding the criteria visitors adopt when planning a visit, and predicting their destinations. Both can be valuable tools for city management and for simulating what-if scenarios. The approaches considered in this work are built around social network data sources that provide positioning information about their users; in particular, the experimental evaluations are performed on Flickr data. In the first part of the work, two main criteria driving a user’s choice of the next location to visit are considered: the willingness to move far away versus the popularity of the place to visit. Empirical results on Venice, a representative example of a mass-tourism city, suggest that both play an important role for most visitors, with some minorities driven almost exclusively by one of them and virtually nobody moving randomly. In the second and largest part of the thesis, we compare several sequence prediction approaches on the task of predicting the next point of interest in a user’s itinerary. The candidate solutions include standard Hidden Markov Models (HMMs), Sequential Rule Mining (SRM), Recurrent Neural Networks (RNNs), and a Hybrid model that combines the first two methods. Empirical evaluations suggest that HMMs and SRM have limited accuracy on this task, yet have complementary strengths that the Hybrid model successfully exploits, achieving significant improvements. Finally, RNNs showed the best performance among all the approaches, despite the relatively limited size of the available training dataset.
19 Oct 2018
Italian
Pedreschi, Dino
Nanni, Mirco
Università degli Studi di Pisa
Files in this record:
farzad_vaziri_abstract.pdf: Open Access from 26/10/2021; type: other attached material; size: 51.49 kB; format: Adobe PDF
farzad_vaziri_report.pdf: Open Access from 26/10/2021; type: other attached material; size: 71.33 kB; format: Adobe PDF
farzad_vaziri_thesis.pdf: Open Access from 26/10/2021; type: other attached material; size: 4.27 MB; format: Adobe PDF

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/133521
The NBN code of this thesis is URN:NBN:IT:UNIPI-133521