Context Awareness in First Person Vision
FURNARI, ANTONINO
2017
Abstract
The First Person Vision (FPV) paradigm makes it possible to seamlessly acquire images of the world from the user's perspective. Compared to standard Third Person Vision, FPV is advantageous for building intelligent wearable systems able to assist users and augment their abilities. Given their intrinsic mobility and their ability to acquire user-related information, FPV systems have to deal with a continuously evolving environment. Starting from the observation that data acquired from a first person perspective is highly personal, we investigate contextual awareness for First Person Vision systems. We first focus on the task of recognizing personal locations of interest from egocentric videos. We consider personal locations at the instance level and address the problem of rejecting locations that are not of interest to the user. To study the problem, we introduce three publicly available datasets of 10 personal locations and benchmark different wearable devices and state-of-the-art representations. Moreover, we propose and evaluate methods to reject negative locations and to perform personal location-based temporal segmentation of egocentric videos. As a second aspect, we investigate the anticipation of object interaction. We define the task of next-active-object prediction as recognizing which objects are going to be interacted with before the interaction actually begins. Although recognizing next-active-objects is generally non-trivial in unconstrained settings, we show that the First Person Vision paradigm provides useful cues to address the challenge. We propose a next-active-object prediction method based on the analysis of egocentric object trajectories and show that it outperforms other cues such as object appearance, distance from the center of the frame, presence of hands, and visual saliency. In the appendix, we also report investigations on extracting features directly from wide-angle images.
File | Size | Format
---|---|---
PhDThesisFurnari.pdf (open access) | 74.53 MB | Adobe PDF
Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/20.500.14242/75863
URN:NBN:IT:UNICT-75863