Simultaneous localisation and mapping (SLAM) is a technique studied in computer vision and robotics that, given measurements obtained from one or more sensors, allows incremental building of a map of the environment and simultaneous estimation of the position and orientation of the very same sensor used to acquire the input data. Visual SLAM systems typically allow the generation of accurate reconstructions of the explored environment but, until very recently, did not provide high level informations on the contents of the reconstructed scenes, useful to foster high level reasoning by subsequent algorithms. In this thesis we focus on the topic of Semantic SLAM, proposing techniques to obtain semantically accurate reconstructions of the explored environment by combining efficient SLAM systems with state-of-the-art semantic image segmentation algorithms. We show how, by relying on such semantic reconstructions, the accuracy of the localisation phase of a SLAM pipeline can improve, by accounting for the presence of semantic informations during the camera pose estimation step. We thus realise a "semantic loop", where the availability of high level clues betters the mapping process, in turn helping the subsequent localisation phase. A full system, drawing inspiration from the presented research, allowing a real-time and automatic semantic mapping of large-scale environments is then presented. An ancillary, but nevertheless important, component of simultaneous localisation and mapping systems is a technique to allow the estimation of sensor position separately from the main SLAM loop, to recover from failures in the localisation algorithm. We present a technique that, by exploiting the appearance of image patches, can reliably localise the likely position of the sensor used to acquire such images. Such relocalisation system can be easily included in a Semantic SLAM system to allow a more robust mapping process wherein camera tracking failures can be reliably recovered from.

Semantic Slam: A New Paradigm for Object Recognition and Scene Reconstruction

2017

Abstract

Simultaneous localisation and mapping (SLAM) is a technique studied in computer vision and robotics that, given measurements obtained from one or more sensors, allows incremental building of a map of the environment and simultaneous estimation of the position and orientation of the very same sensor used to acquire the input data. Visual SLAM systems typically allow the generation of accurate reconstructions of the explored environment but, until very recently, did not provide high level informations on the contents of the reconstructed scenes, useful to foster high level reasoning by subsequent algorithms. In this thesis we focus on the topic of Semantic SLAM, proposing techniques to obtain semantically accurate reconstructions of the explored environment by combining efficient SLAM systems with state-of-the-art semantic image segmentation algorithms. We show how, by relying on such semantic reconstructions, the accuracy of the localisation phase of a SLAM pipeline can improve, by accounting for the presence of semantic informations during the camera pose estimation step. We thus realise a "semantic loop", where the availability of high level clues betters the mapping process, in turn helping the subsequent localisation phase. A full system, drawing inspiration from the presented research, allowing a real-time and automatic semantic mapping of large-scale environments is then presented. An ancillary, but nevertheless important, component of simultaneous localisation and mapping systems is a technique to allow the estimation of sensor position separately from the main SLAM loop, to recover from failures in the localisation algorithm. We present a technique that, by exploiting the appearance of image patches, can reliably localise the likely position of the sensor used to acquire such images. Such relocalisation system can be easily included in a Semantic SLAM system to allow a more robust mapping process wherein camera tracking failures can be reliably recovered from.
2017
it
File in questo prodotto:
File Dimensione Formato  
Cavallari_Tommaso_Tesi.pdf

accesso solo da BNCF e BNCR

Tipologia: Altro materiale allegato
Licenza: Tutti i diritti riservati
Dimensione 24.7 MB
Formato Adobe PDF
24.7 MB Adobe PDF

I documenti in UNITESI sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/20.500.14242/313696
Il codice NBN di questa tesi è URN:NBN:IT:BNCF-313696