Bridging Sensor Data Streams and Human Knowledge

Zeni, Mattia
2017

Abstract

Generating useful knowledge from personal big data in the form of sensor streams is a difficult task, and the intrinsic characteristics of this type of data, namely its volume, velocity, variety and noisiness, pose multiple challenges. This is an instance of a well-known, long-standing problem in computer science called the Semantic Gap Problem. It was originally defined in the research area of image processing as "... the lack of coincidence between the information that one can extract from the visual data and the interpretation that the same data have for a user in a given situation..." [Smeulders et al., 2000]. In the context of this work, the lack of coincidence is between the low-level raw streaming data collected by sensors in a machine-readable format and the higher-level semantic knowledge that can be generated from these data and that only humans can understand, thanks to their intelligence, habits and routines.

This thesis addresses the semantic gap problem in this context, proposing an interdisciplinary approach able to generate human-level knowledge from streaming sensor data in open domains. It draws on two research fields: the collection, management and analysis of big data, and semantic computing, with a focus on ontologies. These map, respectively, to the two elements of the semantic gap mentioned above.

The contributions of this thesis are:

• The definition of a methodology based on the idea that the user and the world surrounding them can be modeled by defining most of the elements of the user's context as entities (locations, people and objects, among others, and the relations among them), together with the attributes of each. The modeling aspects of this ontology are outside the scope of this work. Given such a structure, the task of bridging the semantic gap is divided into many less complex, modular and compositional micro-tasks, each of which consists in mapping the streaming sensor data, using contextual information, to the attribute values of the corresponding entities. In this way, structure is created out of the unstructured, noisy and highly variable sensor data, which the machine can then use to provide personalized, context-aware services to the end user;
• The definition of a reference architecture that applies the methodology above and addresses the semantic gap problem in streaming sensor data;
• The instantiation of the architecture above in the Stream Base System (SB), with its main components implemented using state-of-the-art software solutions and technologies;
• The adoption of the Stream Base System in four use cases with very different objectives, demonstrating that it works in open domains.
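The micro-task idea in the first contribution can be illustrated with a minimal sketch. It is not taken from the thesis or from the Stream Base System: the class names (`Entity`, `Reading`), the two example micro-tasks, the sensor names and the numeric threshold are all hypothetical, chosen only to show how small, independent functions might map raw readings to attribute values of modeled entities using contextual information.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Iterable, List, Optional, Tuple


@dataclass
class Entity:
    """A modeled element of the user's context (hypothetical structure)."""
    name: str
    kind: str  # e.g. "person", "location", "object"
    attributes: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Reading:
    """One low-level sample from a sensor stream (hypothetical structure)."""
    sensor: str
    timestamp: float
    value: Any


# An attribute update: (entity name, attribute name, new value).
Update = Tuple[str, str, Any]
# A micro-task inspects one reading plus the context and may emit an update.
MicroTask = Callable[[Reading, Dict[str, Entity]], Optional[Update]]


def location_task(reading: Reading, context: Dict[str, Entity]) -> Optional[Update]:
    """Map a GPS fix to the name of a known location entity, if any contains it."""
    if reading.sensor != "gps":
        return None
    lat, lon = reading.value
    for entity in context.values():
        if entity.kind != "location":
            continue
        # Assumed attribute: bbox = (min_lat, min_lon, max_lat, max_lon).
        min_lat, min_lon, max_lat, max_lon = entity.attributes["bbox"]
        if min_lat <= lat <= max_lat and min_lon <= lon <= max_lon:
            return ("user", "current_location", entity.name)
    return None  # no known location matches; leave the attribute unchanged


def activity_task(reading: Reading, context: Dict[str, Entity]) -> Optional[Update]:
    """Map an acceleration magnitude to a coarse activity label."""
    if reading.sensor != "accelerometer":
        return None
    # The 1.5 threshold is an arbitrary illustrative value, not a calibrated one.
    return ("user", "activity", "moving" if reading.value > 1.5 else "still")


def process(stream: Iterable[Reading],
            tasks: List[MicroTask],
            context: Dict[str, Entity]) -> Dict[str, Entity]:
    """Run every micro-task on every reading and apply the resulting updates."""
    for reading in stream:
        for task in tasks:
            update = task(reading, context)
            if update is not None:
                entity_name, attribute, value = update
                context[entity_name].attributes[attribute] = value
    return context
```

Because each micro-task only looks at the readings it understands and returns at most one attribute update, tasks can be added or removed without touching the others, which is one way to read the "modular and compositional" property described above.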
2017
English
Giunchiglia, Fausto
Università degli studi di Trento
TRENTO
181
Files in this item:

Disclaimer_Zeni.pdf: open access, 1.06 MB, Adobe PDF
Thesis.pdf: access only from BNCF and BNCR, 8.27 MB, Adobe PDF

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/90817
The NBN code of this thesis is URN:NBN:IT:UNITN-90817