Recent years have seen an explosion in multimodal data on the web. It is therefore important to perform multimodal learning to understand the web. However, it is challenging to join various modalities because each modality has a different representation and correlational structure. In addition, various modalities generally carry different kinds of information that may provide enrich understanding; for example, the visual signal of a flower may provide happiness; however, its scent might not be pleasant. Multimodal information may be useful to make an informed decision. Therefore, we focus on improving representations from individual modalities to enhance multimodal representation and learning. In this doctoral thesis, we presented techniques to enhance representations from individual and multiple modalities for multimodal applications including classification, cross-modal retrieval, matching and verification on various benchmark datasets.

Multimodal representation and learning

NAWAZ, SHAH
2019

Abstract

Recent years have seen an explosion in multimodal data on the web. It is therefore important to perform multimodal learning to understand the web. However, it is challenging to join various modalities because each modality has a different representation and correlational structure. In addition, various modalities generally carry different kinds of information that may provide enrich understanding; for example, the visual signal of a flower may provide happiness; however, its scent might not be pleasant. Multimodal information may be useful to make an informed decision. Therefore, we focus on improving representations from individual modalities to enhance multimodal representation and learning. In this doctoral thesis, we presented techniques to enhance representations from individual and multiple modalities for multimodal applications including classification, cross-modal retrieval, matching and verification on various benchmark datasets.
2019
Inglese
Multimodal, cross-modal retrieval, cross-modal matching, cross-modal verification
GALLO, IGNAZIO
Università degli Studi dell'Insubria
File in questo prodotto:
File Dimensione Formato  
PhD_Thesis_NawazShah_completa.pdf

accesso aperto

Licenza: Tutti i diritti riservati
Dimensione 4.19 MB
Formato Adobe PDF
4.19 MB Adobe PDF Visualizza/Apri

I documenti in UNITESI sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/20.500.14242/300301
Il codice NBN di questa tesi è URN:NBN:IT:UNINSUBRIA-300301