Semantic Image Interpretation - Integration of Numerical Data and Logical Knowledge for Cognitive Vision

Donadello, Ivan

Semantic Image Interpretation (SII) is the process of generating a structured description of the content of an input image. This description is encoded as a labelled directed graph in which nodes correspond to objects in the image and edges to semantic relations between objects. Such a detailed structure allows more accurate search and retrieval of images. In this thesis, we propose two well-founded methods for SII. Both methods exploit background knowledge about the domain of the images, expressed as the logical constraints of a knowledge base. The first method formalizes SII as the extraction of a partial model of a knowledge base. Partial models are built with a clustering and reasoning algorithm that considers both low-level and semantic features of images. The second method uses the Logic Tensor Networks framework to build the labelled directed graph of an image. This framework is able to learn from data in the presence of the logical constraints of the knowledge base. The graph is therefore constructed by predicting the labels of the nodes and of the relations according to the logical constraints and the features of the objects in the image. These methods advance the state of the art by introducing two well-founded methodologies that integrate low-level and semantic features of images with logical knowledge. Other methods, by contrast, either do not deal with low-level features or use only statistical knowledge coming from training sets or corpora. Moreover, the second method outperforms the state of the art on the standard task of visual relationship detection.
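As a rough illustration of the output structure described in the abstract, the following minimal Python sketch represents a labelled directed graph whose nodes are objects and whose edges are semantic relations, and checks it against one toy logical constraint. It is not the thesis implementation: the names `SceneGraph`, `ALLOWED_SUBJECTS` and `violations`, the example labels, and the constraint itself are all hypothetical and chosen only for this example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class SceneGraph:
    """Labelled directed graph: nodes are objects, edges are semantic relations."""

    node_labels: dict[int, str] = field(default_factory=dict)
    edges: set[tuple[int, str, int]] = field(default_factory=set)

    def add_object(self, node_id: int, label: str) -> None:
        # Attach a class label (e.g. "person") to an object node.
        self.node_labels[node_id] = label

    def add_relation(self, subject: int, relation: str, obj: int) -> None:
        # Add a directed, labelled edge subject --relation--> obj.
        if subject not in self.node_labels or obj not in self.node_labels:
            raise KeyError("both endpoints must be added as objects first")
        self.edges.add((subject, relation, obj))

    def triples(self) -> list[tuple[str, str, str]]:
        # Human-readable (subject label, relation, object label) triples.
        return sorted(
            (self.node_labels[s], r, self.node_labels[o]) for s, r, o in self.edges
        )


# Hypothetical background knowledge: the subject of "rides" must be a person.
ALLOWED_SUBJECTS = {"rides": {"person"}}


def violations(graph: SceneGraph) -> list[tuple[int, str, int]]:
    """Return the edges whose subject label breaks the toy constraint above."""
    return sorted(
        (s, r, o)
        for s, r, o in graph.edges
        if r in ALLOWED_SUBJECTS and graph.node_labels[s] not in ALLOWED_SUBJECTS[r]
    )


graph = SceneGraph()
graph.add_object(0, "person")
graph.add_object(1, "horse")
graph.add_relation(0, "rides", 1)
print(graph.triples())
print(violations(graph))
```

The sketch only checks a hard constraint after the graph has been built. The methods summarized in the abstract instead use the constraints while the graph is being constructed.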

2018



Faculty/Department: Ingegneria e scienza dell'Informaz (29/10/12-)
Course of study: Information and Communication Technology
Publication date: 2018
Language: English
Supervisor, Advisor or Tutor: Serafini, Luciano
Publisher: Università degli studi di Trento
Publisher city: Trento
Number of pages: 125
Collection: Università degli Studi di Trento

Files in this item:

File	Access	License	Size	Format
PhD-Thesis.pdf	open access	All rights reserved	4.61 MB	Adobe PDF
Disclaimer_Donadello.pdf	access only from BNCF and BNCR	All rights reserved	907.07 kB	Adobe PDF

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/177945

The NBN code of this thesis is URN:NBN:IT:UNITN-177945