Combining declarative models and computer vision recognition algorithms for stroke gestures

CARCANGIU, ALESSANDRO
2019

Abstract

Consumer-level devices that track the user’s gestures have eased the design and implementation of interactive applications that rely on body movements as input. Gesture recognition based on computer vision and machine learning focuses mainly on accuracy and robustness. The resulting classifiers label gestures precisely once they have been performed, but they do not provide intermediate information during execution. Human-Computer Interaction research has focused instead on providing easy and effective guidance for performing and discovering interactive gestures. The compositional approaches developed to solve this problem provide information on both the whole gesture and its sub-parts, but they rely on heuristic techniques that have low recognition accuracy. In this thesis, we introduce two methods, DEICTIC and G-Gene, designed to strike a compromise between accuracy and the information provided. DEICTIC exploits a compositional and declarative description of stroke gestures. It uses basic Hidden Markov Models (HMMs) to recognise meaningful predefined primitives (gesture sub-parts) and composes them to recognise complex gestures. It provides information for supporting gesture guidance and reaches an accuracy comparable with state-of-the-art approaches on two datasets from the literature. However, the normalization of the gesture samples limits online recognition in the general case. G-Gene, instead, is a method for transforming compositional stroke gesture definitions into profile HMMs, which provide both good accuracy and information on gesture sub-parts. It supports online recognition without using any global feature, and it updates the information while receiving the input stream, with an accuracy suitable for prototyping the interaction.
We evaluated both approaches in a simplified development task with real developers, showing that they require less time and an effort comparable to compositional approaches, while the definition procedure and the perceived recognition accuracy are comparable to those of machine learning.
8 Feb 2019
English
ROLI, FABIO
SPANO, LUCIO DAVIDE
Università degli Studi di Cagliari
Files in this item:
tesi di dottorato_Alessandro Carcangiu.pdf

open access

Size 6.61 MB
Format Adobe PDF

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/70666
The NBN code of this thesis is URN:NBN:IT:UNICA-70666