
Deep Learning Safety under Non-Stationarity Assumptions

2021

Abstract

Deep Learning (DL) is having a transformational effect in critical areas such as finance, healthcare, transportation, and defense, impacting nearly every aspect of our lives. Many businesses, eager to capitalize on advances in DL, may not have scrutinized the security issues potentially induced by embedding such intelligent components in their systems. Building a trustworthy DL system requires enforcing key properties, including robustness, privacy, and accountability. This thesis aims to enhance the robustness of DL models to input distribution drifts, i.e., situations where the training and test distributions differ. Notably, input distribution drifts may arise either naturally, induced by missing input data (e.g., due to a sensor fault), or adversarially, i.e., crafted by an attacker to steer the model's behavior as desired. In this thesis, we first provide a technique for making DL models robust to missing inputs by design, yielding resilience even in sequential tasks. We then propose a detection framework for adversarial attacks that accommodates both existing techniques from the literature and novel proposals, such as our new detector built around non-linear dimensionality reduction. Finally, by abstracting the analyzed defenses within our framework, we identify common drawbacks, which we propose to overcome with a fast adversarial example detection technique that substantially reduces the detection overhead without sacrificing detector accuracy on either clean data or data under attack.
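
To make the non-linear dimensionality reduction idea concrete, the sketch below shows one possible way such an adversarial-example detector could be organized. It is an illustrative assumption, not the method described in the thesis: it embeds clean feature vectors (e.g., a network's penultimate-layer activations) with kernel PCA and flags inputs whose embedded representation lies unusually far from the clean training points. The class name, parameters, and thresholding rule are all hypothetical.

# Illustrative sketch (hypothetical, not the thesis's method): a detector that applies
# a non-linear dimensionality reduction (kernel PCA) to feature vectors and rejects
# inputs whose embedding is anomalously far from the clean training data.
import numpy as np
from sklearn.decomposition import KernelPCA
from sklearn.neighbors import NearestNeighbors


class EmbeddingDetector:
    def __init__(self, n_components=32, n_neighbors=10, percentile=95.0):
        self.reducer = KernelPCA(n_components=n_components, kernel="rbf")
        self.n_neighbors = n_neighbors
        self.percentile = percentile

    def fit(self, clean_features):
        # Learn the low-dimensional embedding from clean (benign) features only.
        embedded = self.reducer.fit_transform(clean_features)
        self.nn = NearestNeighbors(n_neighbors=self.n_neighbors).fit(embedded)
        # Calibrate the rejection threshold on the clean data themselves
        # (each training point's nearest neighbour is itself; acceptable for a rough calibration).
        scores = self._score(embedded)
        self.threshold = np.percentile(scores, self.percentile)
        return self

    def _score(self, embedded):
        # Anomaly score: mean distance to the k nearest clean points in embedding space.
        dist, _ = self.nn.kneighbors(embedded)
        return dist.mean(axis=1)

    def predict(self, features):
        # Returns True for inputs flagged as potentially adversarial.
        scores = self._score(self.reducer.transform(features))
        return scores > self.threshold

In a typical use, fit would be called on features extracted from known-clean inputs and predict on incoming test inputs; the calibration percentile sets the trade-off between detection rate and false positives on clean data.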
24-apr-2021
Italian
Bacciu, Davide
Biggio, Battista
Università degli Studi di Pisa
Files in this record:

diss.pdf (open access): other attached material, Adobe PDF, 10.82 MB
relazione.pdf (open access): other attached material, Adobe PDF, 132.01 kB

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/148294
The NBN code of this thesis is URN:NBN:IT:UNIPI-148294