
Predictive Modeling of Human Behavior: Supervised Learning from Telecom Metadata

Bogomolov, Andrey
2017

Abstract

Big data, and telecom metadata in particular, opens new opportunities for understanding human behavior by applying machine learning and big data processing methods combined with interdisciplinary knowledge of human behavior. This thesis develops new methods for predictive modeling of human behavior from anonymized telecom metadata, at the individual level and at the large-scale group level. The methods were studied in research projects carried out in 2012-2016 in collaboration with Telecom Italia, Telefonica Research, MIT Media Lab and the University of Trento. It is shown that patterns of human dynamics can be reliably recognized from behavioral metrics derived from mobile phone and cellular network activity (call logs, SMS logs, Bluetooth interactions, internet consumption). At the individual level, the results are validated on two use cases: detecting daily stress and estimating subjective happiness. An original approach is introduced for feature extraction, feature selection, recognition model training and validation. Experimental results based on ensemble stochastic classification and regression tree models are discussed. At the large group level, following big-data-for-social-good challenges, the problem of crime hotspot prediction is formulated and solved. The proposed approach uses demographic information along with human mobility characteristics derived from anonymized and aggregated mobile network data. The models, built on and evaluated against real crime data from London, achieve an accuracy of almost 70% when classifying whether a specific area of the city will be a crime hotspot in the following month.
Electric energy consumption patterns are correlated with human behavior patterns in a highly nonlinear way. The second large-scale group behavior prediction result is formulated as predicting next-week energy consumption from human dynamics derived from anonymized and aggregated telecom data, processed from GSM network call detail records (CDRs). The proposed solution could serve energy producers and distributors as an essential complement to smart meter data, supporting better decisions in three ways: reducing total primary energy consumption by limiting energy production when no demand is predicted, reducing energy distribution costs through efficient buy-side planning in time, and providing insights for peak load planning in geographic space. All of the reported experimental results rely on the introduced methodology, which is efficient to implement for most multimedia and real-time applications thanks to its highly reduced, low-dimensional feature space and reduced machine learning pipelines. The indicators with strong predictive power are also discussed, opening new horizons for computational social science studies.
2017
English
Pianesi, Fabio
Università degli studi di Trento
TRENTO
78
Files in this record:

Disclaimer_Bogomolov.pdf (access only from BNCF and BNCR), 291.5 kB, Adobe PDF
bogomolov-thesis-open-access.pdf (open access), 6.26 MB, Adobe PDF

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/93178
The NBN code of this thesis is URN:NBN:IT:UNITN-93178