Context Awareness in First Person Vision

FURNARI, ANTONINO
2017

Abstract

The First Person Vision (FPV) paradigm makes it possible to seamlessly acquire images of the world from the user's perspective. Compared to standard Third Person Vision, FPV is advantageous for building intelligent wearable systems able to assist users and augment their abilities. Given their intrinsic mobility and their ability to acquire user-related information, FPV systems have to deal with a continuously evolving environment. Starting from the observation that data acquired from a first person perspective is highly personal, we investigate contextual awareness for First Person Vision systems. We first focus on the task of recognizing personal locations of interest from egocentric videos. We consider personal locations at the instance level and address the problem of rejecting locations that are not of interest to the user. To tackle the problem, we introduce three datasets of 10 personal locations, which we make publicly available, and benchmark different wearable devices and state-of-the-art representations. Moreover, we propose and evaluate methods to reject negative locations and to perform personal location-based temporal segmentation of egocentric videos. As a second aspect, we investigate the anticipation of object interactions. We define the task of next-active-object prediction as recognizing which objects are going to be interacted with before the actual interaction begins. Although recognizing next-active-objects is generally non-trivial in unconstrained settings, we show that the First Person Vision paradigm provides useful cues to address the challenge. We propose a next-active-object prediction method based on the analysis of egocentric object trajectories and show that it outperforms other cues such as object appearance, distance from the center of the frame, presence of hands, and visual saliency. In the appendix, we also report some investigations on extracting features directly from wide-angle images.
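
The abstract only names the kind of cue the predictor relies on (egocentric object trajectories) without detailing the method. Purely as an illustrative sketch, and not the approach proposed in the thesis, the toy scorer below ranks tracked objects by two assumed trajectory cues: drift towards the frame center and growth in apparent size. All names, features, and weightings here are hypothetical.

```python
# Illustrative sketch only: rank tracked objects by simple trajectory cues
# (drift towards the frame center, growth in apparent size). These cues and
# their weighting are assumptions for illustration, not the thesis method.

from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Track:
    """A tracked object: per-frame bounding boxes as (x, y, w, h)."""
    object_id: str
    boxes: List[Tuple[float, float, float, float]]


def trajectory_score(track: Track, frame_size: Tuple[int, int]) -> float:
    """Score how likely the object is to become active next (higher = more likely)."""
    fw, fh = frame_size
    cx, cy = fw / 2.0, fh / 2.0

    def center_dist(box):
        x, y, w, h = box
        return ((x + w / 2 - cx) ** 2 + (y + h / 2 - cy) ** 2) ** 0.5

    def area(box):
        _, _, w, h = box
        return w * h

    first, last = track.boxes[0], track.boxes[-1]
    # Cue 1: the object drifts towards the frame center over the trajectory.
    approach = center_dist(first) - center_dist(last)
    # Cue 2: the object grows in apparent size (the user moves towards it).
    growth = area(last) - area(first)

    # Crude normalization by frame dimensions so the two cues are comparable.
    return approach / max(fw, fh) + growth / (fw * fh)


if __name__ == "__main__":
    frame = (1280, 720)
    tracks = [
        Track("mug", [(900, 500, 60, 60), (760, 420, 80, 80), (660, 380, 110, 110)]),
        Track("poster", [(100, 80, 200, 150), (105, 82, 200, 150), (110, 85, 200, 150)]),
    ]
    ranked = sorted(tracks, key=lambda t: trajectory_score(t, frame), reverse=True)
    print("Predicted next-active-object:", ranked[0].object_id)
```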
28 Jan 2017
English
BATTIATO, SEBASTIANO
RUSSO, Giovanni
Università degli studi di Catania
Catania
Files in this item:
PhDThesisFurnari.pdf (open access, 74.53 MB, Adobe PDF)

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/75863
The NBN code of this thesis is URN:NBN:IT:UNICT-75863