Exploring Convolutional Neural Networks for Music Information Retrieval: Neural Architecture Search, Data Augmentation, and Explainability

ROSSI, MICHELE
2026

Abstract

This thesis explores three key aspects of applying Convolutional Neural Networks (CNNs) to Music Information Retrieval (MIR): hyperparameter optimization, data augmentation, and explainability. First, I developed a custom Neural Architecture Search method based on genetic algorithms to optimize CNNs for classifying guitar effect chains, achieving a better balance between accuracy and model compactness than Random Search. Second, I investigated data augmentation for Music Emotion Recognition (MER) on guitar recordings, systematically testing 11 techniques and showing that pitch shifting, time stretching, and time shifting were the most effective without significantly affecting perceived emotion. Finally, I studied explainability in MER by adapting Grad-CAM, SHAP, and LIME to musical spectrograms, and I developed an application that provides multi-level explanations of emotion predictions in guitar improvisations.
2 May 2026
English
convolutional neural networks
data augmentation
explainability
music emotion recognition
music information retrieval
neural architecture search
Turchet, Luca
Iacca, Giovanni
Files in this item:
File: PHD_Thesis_Michele_Rossi_defpdfA.pdf
Access: open access
License: Creative Commons
Size: 8.62 MB
Format: Adobe PDF

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/367066
The NBN code of this thesis is URN:NBN:IT:UNIPI-367066