Spectator crowd: a social signal processing perspective

CONIGLIARO, Davide
2016

Abstract

This thesis proposes a new type of crowd analysis in computer vision, focused on the spectator crowd, that is, people "interested in watching something specific that they came to see". Typical spectator-crowd scenarios are stadiums, amphitheaters, classrooms, etc., and they share some aspects with classical crowd monitoring: for instance, since many people are observed simultaneously, per-person analysis is hard. In the cases considered, however, human dynamics are more constrained by the architectural environment: people are expected to stay in a fixed location most of the time, limiting their activities to applauding, watching, supporting or heckling the players, or talking with their neighbors. We approach this challenge from a social signal processing perspective, which grounds computer vision techniques in social theories. More specifically, leveraging social theories that describe expressive bodily conduct, we show how people's behaviors can be distinguished by automatically detecting their social activities. In particular, we propose a novel dataset, Spectators Hockey (S-Hock), which covers 4 hockey matches recorded during an international tournament. The dataset has been extensively annotated, focusing on the spectators at different levels of detail: at the higher level, people have been labeled according to the team they were supporting and their acquaintance with the spectators sitting close to them; at the lower levels, standard pose information has been considered (head and body), as well as fine-grained actions such as hands on hips, clapping hands, etc. The game field has also been labeled, making it possible to relate what was going on in the match to the crowd behavior.
This resulted in more than 100 million annotations, useful for standard low-level applications such as object counting, people detection and head pose estimation, but also for high-level tasks such as spectator categorization and event recognition. For all of these we provide protocols and baseline results, encouraging further research. In this general picture, this thesis is devoted to demonstrating that a strong sociological background is necessary to deal with crowd analysis in general, and to underlining the need to explore a novel, specific problem, namely the spectator crowd, by developing approaches able to adapt to the peculiarities of this scenario, which is new in computer vision. We are confident that S-Hock and our studies may trigger the design of novel and effective approaches for the analysis of human behavior in crowded settings and environments.
English
spectator crowd, crowd analysis, crowd dataset, object counting, spectator categorization, people detection, head pose estimation, ontology
139
Files in this item:

File: thesis_Conigliaro.pdf
Access: only from BNCF and BNCR
Size: 32.97 MB
Format: Adobe PDF

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/181665
The NBN code of this thesis is URN:NBN:IT:UNIVR-181665