Content-based image retrieval for visual big data analysis

Magliani, Federico

2020

Abstract

Content-Based Image Retrieval (CBIR) is a computer vision task. The growth in the number of digital images on the Internet has encouraged more proposed solutions for this task than ever before. Access to this huge quantity of data has enabled the creation of large datasets, which bring many new challenges. Briefly, the objective is to retrieve the images similar to a query image and rank them; the quality of this ranking, called retrieval accuracy, needs to be as high as possible. There are also secondary targets, retrieval time and memory occupancy, which need to be as low as possible. The problem is trivial for humans, who perform the task through experience and semantic perception, but it is not so easy for a computer. This difficulty is known as the semantic gap: the gap between low-level image pixels and high-level semantic concepts. Furthermore, images may contain noisy patches (e.g. trees, people, cars) and may be taken under different lighting conditions, viewpoints and resolutions. To solve this problem it is crucial to develop algorithms and techniques that reduce the weight of the unnecessary image patches and that work well with vast quantities of data. CBIR systems have several applications: library and museum applications, fashion applications for searching for particular clothes, and advanced electronic tourist guides. This thesis presents a complete pipeline for solving the CBIR problem and then evaluates every step of the process, with a particular focus on CNN transfer learning, embeddings, large-scale retrieval, and graph-based methods used as a diffusion mechanism. All the methods presented are tested on several public image datasets in order to compare the final retrieval results.
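
The retrieve-and-rank objective described in the abstract can be illustrated with a minimal sketch. It is not taken from the thesis: it assumes each image has already been reduced to a fixed-length descriptor (for example a CNN embedding) and uses cosine similarity as the ranking score. The names `l2_normalize` and `rank_by_cosine` and the toy descriptors are hypothetical, chosen only for illustration.

```python
import math


def l2_normalize(vec):
    """Scale a descriptor to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def rank_by_cosine(query, database, top_k=None):
    """Return (index, score) pairs sorted by decreasing cosine similarity.

    `query` is one descriptor; `database` is a list of descriptors of the
    same length. Both are assumptions of this sketch, not the thesis API.
    """
    q = l2_normalize(query)
    scored = []
    for idx, desc in enumerate(database):
        d = l2_normalize(desc)
        # Dot product of unit vectors equals their cosine similarity.
        scored.append((idx, sum(a * b for a, b in zip(q, d))))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored if top_k is None else scored[:top_k]


# Toy 3-dimensional descriptors standing in for real image embeddings.
database = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
]
query = [0.2, 1.0, 0.0]

ranking = rank_by_cosine(query, database, top_k=2)
ranked_ids = [idx for idx, _ in ranking]
```

This exhaustive comparison costs time linear in the database size, which is why the secondary targets of retrieval time and memory occupancy matter at large scale.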

Degree program
	PhD program in Information Technologies (Dottorato di ricerca in Tecnologie dell'informazione)

Publication date
	March 2020

Language
	Italian

Keywords
	content-based image retrieval
	LSH
	R-MAC+
	locVLAD
	Bag of Indexes
	LSH kNN graph
	ING.INF./05

Publisher
	Università degli Studi di Parma

Collection
	Università degli Studi di Parma

Files in this item:

File	Access	Type	Size	Format
relazione-finale-schema.pdf	BNCF and BNCR only	Other attached material	64.89 kB	Adobe PDF
TesiDottorato.pdf	BNCF and BNCR only	Other attached material	4.67 MB	Adobe PDF

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/134450

The NBN code of this thesis is URN:NBN:IT:UNIPR-134450