Content-based image retrieval for visual big data analysis

2020

Abstract

Content-Based Image Retrieval (CBIR) is a computer vision task. The growth in the number of digital images on the Internet has made solutions to this task more valuable than ever. Access to this huge quantity of data has enabled the creation of large datasets, which have brought many new challenges. Briefly, the objective is to retrieve the images similar to a query image and rank them; the quality of this ranking, called retrieval accuracy, needs to be as high as possible. There are also secondary targets, such as retrieval time and memory occupancy, which need to be as low as possible. The problem is trivial for humans, who perform this task through experience and semantic perception, but it is not so easy for a computer. This difficulty is known as the semantic gap: the gap between low-level image pixels and high-level semantic concepts. Furthermore, images may contain noisy patches (e.g. trees, people, cars) and may be taken under different lighting conditions, viewpoints and resolutions. To solve this problem, it is crucial to develop algorithms and techniques that reduce the weight of the unnecessary patches of the images and that work well with a vast quantity of data. CBIR systems have several applications: library and museum applications, fashion applications for searching for particular clothes, and advanced electronic tourist guides. This thesis presents a complete pipeline for solving the CBIR problem and then evaluates all the steps of the process, with a particular focus on CNN transfer learning, embeddings, large-scale retrieval and graph-based methods used as a diffusion mechanism. All the methods presented are tested on several public image datasets in order to compare the final retrieval results.
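The core retrieve-and-rank step described in the abstract can be illustrated with a minimal sketch. It assumes each image has already been reduced to a fixed-length descriptor vector (for example by a CNN) and that similarity is measured with cosine similarity; both are illustrative assumptions, not necessarily the choices made in the thesis. The function name `rank_by_cosine` and the toy vectors are hypothetical.

```python
import numpy as np

def rank_by_cosine(query, database, top_k=3):
    """Return the indices of the top_k database descriptors most similar
    to the query, together with their cosine similarity scores.

    Hypothetical helper for illustration only.
    """
    # L2-normalise so that a dot product equals cosine similarity.
    q = query / np.linalg.norm(query)
    db = database / np.linalg.norm(database, axis=1, keepdims=True)
    scores = db @ q
    # Sort by decreasing similarity and keep the best top_k entries.
    order = np.argsort(-scores)[:top_k]
    return order, scores[order]

# Toy descriptors standing in for image embeddings (assumed values).
database = np.array([
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
])
query = np.array([0.8, 0.2, 0.0])

indices, scores = rank_by_cosine(query, database)
print(indices)  # ranked database indices, most similar first
print(scores)   # corresponding cosine similarities
```

This exhaustive comparison costs time linear in the database size, which is why the secondary targets of retrieval time and memory occupancy matter at large scale.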
Mar-2020
Italian
content-based image retrieval
LSH
R-MAC+
locVLAD
Bag of Indexes
LSH kNN graph
ING.INF./05
Università degli Studi di Parma
Files in this item:
File, Size, Format
relazione-finale-schema.pdf (other attached material; access only from BNCF and BNCR), 64.89 kB, Adobe PDF
TesiDottorato.pdf (other attached material; access only from BNCF and BNCR), 4.67 MB, Adobe PDF

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/134450
The NBN code of this thesis is URN:NBN:IT:UNIPR-134450