Object detection and instance segmentation with deep learning techniques
Leonardo Rossi
2022
Abstract
Object detection and instance segmentation are two of the most studied topics in the computer vision community, because they reflect a key problem for many existing applications: dealing with many heterogeneous objects inside an image. This thesis deals with some important aspects of these two tasks in multiple settings: Supervised, Self-Supervised and Semi-Supervised Learning. We go into detail and tackle multiple intrinsic imbalance problems of current models, defining new tasks and new architectures to improve overall performance. First, we introduce GRoIE, a novel Region of Interest (RoI) extraction layer, to address the problem called Feature Level Imbalance (FLI) in a Feature Pyramid Network (FPN). Then, we present an empirical analysis of a new model head, called FCC, in support of an emerging rule for making the best architectural choices depending on the task to solve. In addition, we address the IoU Distribution Imbalance (IDI) problem with a loop architecture, called $R^3$-CNN, in contrast to the recent HTC cascade network. After that, we introduce a new architecture, called SBR-CNN, which combines all these architectural improvements and proves able to maintain its qualities when plugged into major state-of-the-art models. We also define a new auxiliary self-learning task, $C^2SSL$, with the purpose of enhancing instance segmentation training in the special case of vine disease detection and segmentation. Then, introducing a Semi-Supervised Learning setting, we propose multiple improvements to the Teacher-Student model for the Object Detection task (IL-net). Finally, we define two new datasets: the Leaf Diseases Dataset (LDD), for instance segmentation of leaves, grapes and the related diseases, and the ADIDAS Social Network Dataset (ASND), for object detection of clothes in images coming from social networks.

File | Access | Size | Format |
---|---|---|---|
PhDThesis-dspace-final.pdf | Open access | 22.85 MB | Adobe PDF |
Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/20.500.14242/193245
URN:NBN:IT:UNIPR-193245