Combining declarative models and computer vision recognition algorithms for stroke gestures

Carcangiu, Alessandro

Consumer-level devices that track the user’s gestures have eased the design and implementation of interactive applications that rely on body movements as input. Gesture recognition based on computer vision and machine learning focuses mainly on accuracy and robustness. The resulting classifiers label gestures precisely once they have been performed, but they provide no intermediate information during execution. Human-Computer Interaction research has focused instead on providing easy and effective guidance for performing and discovering interactive gestures. The compositional approaches developed to solve this problem provide information on both the whole gesture and its sub-parts, but they rely on heuristic techniques with low recognition accuracy. In this thesis, we introduce two methods, DEICTIC and G-Gene, designed to strike a compromise between accuracy and the information provided. DEICTIC exploits a compositional and declarative description of stroke gestures. It uses basic Hidden Markov Models (HMMs) to recognise meaningful predefined primitives (gesture sub-parts) and composes them to recognise complex gestures. It provides information for supporting gesture guidance and reaches an accuracy comparable with state-of-the-art approaches on two datasets from the literature. However, the normalization of the gesture samples limits online recognition in the general case. G-Gene, instead, is a method for transforming compositional stroke gesture definitions into profile Hidden Markov Models (HMMs) that provide both good accuracy and information on gesture sub-parts. It supports online recognition without using any global feature, and it updates the information while receiving the input stream, with an accuracy useful for prototyping the interaction.
We evaluated both approaches in a simplified development task with real developers, showing that they require less time and an effort comparable to compositional approaches, while the definition procedure and the perceived recognition accuracy are comparable to machine learning.

2019

Degree programme: Electronic and Computer Engineering (INGEGNERIA ELETTRONICA E INFORMATICA)
Publication date: 8 Feb 2019
Language: English
Supervisor, Advisor or Tutor: ROLI, FABIO; SPANO, LUCIO DAVIDE
Publisher: Università degli Studi di Cagliari
Collection: Università degli Studi di Cagliari

Files in this item:

File	Access	License	Size	Format
tesi di dottorato_Alessandro Carcangiu.pdf	Open access	All rights reserved	6.61 MB	Adobe PDF

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/70666

The NBN code of this thesis is URN:NBN:IT:UNICA-70666