Towards Optimization Methods for Movelets Extraction in Multiple Aspect Trajectory Classification

TORTELLI PORTELA, TARLIS
2023

Abstract

In the last few years there has been a significant increase in the collection of mobility data. By mobility data we mean positioning data, called trajectories, of tracked moving objects. These objects can be humans, animals, vehicles, or other devices such as those in the Internet of Things (IoT). The analysis of such data has proved useful in several application domains, from urban scenarios (traffic prediction, optimization of means of transportation) to the maritime domain (analysis of vessel paths) and the environmental domain (the study of hurricane evolution or animal behavior). One of the most common analysis tasks on mobility data is classification, in which trajectory data is automatically assigned a label or class. The explosion of social media data, sensors, IoT, and Internet-enabled sources has allowed the semantic enrichment of mobility data, which has evolved from raw spatio-temporal data to high-dimensional data. Mobility analysis, and the classification task in particular, therefore becomes more challenging on such high-dimensional data. In fact, existing trajectory classification methods have mainly considered space, time, and numerical data, ignoring the large number of semantic dimensions. Only recently has the research community proposed classification methods based on the concept of movelets, the parts of a trajectory that best discriminate a class and can therefore improve classification accuracy. State-of-the-art methods for movelet extraction are computationally inefficient, which makes them infeasible for large, real-world, high-dimensional datasets. The objective of this thesis is therefore to develop new algorithms for discovering movelets that are faster than the state of the art while maintaining or improving classification accuracy. Our main contribution is a new high-performance method for extracting movelets and classifying trajectories, called HiPerMovelets (High-performance Movelets).
Experimental results show that HiPerMovelets is 10 times faster than the best state-of-the-art method, mitigates the high-dimensionality problem, is more scalable, and achieves high classification accuracy on all evaluated datasets. A secondary contribution is the pair of algorithms RandomMovelets and UltraMovelets. RandomMovelets reduces the search space by randomly extracting subtrajectories and evaluating their relevance for classification without exploring the entire dataset. UltraMovelets reduces the combinatorial explosion that arises when exploring subtrajectories. Preliminary results suggest that these methods reduce the search space, use fewer computational resources, and are at least 6 times faster than the baselines.
22 Apr 2023
Italian
data mining
movelets
multiple aspect trajectories
relevant subtrajectories
trajectory classification
Bernasconi, Anna
Renso, Chiara
Files in this item:
2023_Thesis_Tarlis_UniPi.pdf (open access), 4.58 MB, Adobe PDF
2023__Activities_Report__Tarlis_Portela.pdf (not available), 209.39 kB, Adobe PDF

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/215492
The NBN code of this thesis is URN:NBN:IT:UNIPI-215492