Deep Learning Safety under Non-Stationarity Assumptions
2021
Abstract
Deep Learning (DL) is having a transformational effect in critical areas such as finance, healthcare, transportation, and defense, impacting nearly every aspect of our lives. Many businesses, eager to capitalize on advancements in DL, may not have scrutinized the security issues potentially induced by including such intelligent components in their systems. Building a trustworthy DL system requires enforcing key properties, including robustness, privacy, and accountability. This thesis aims to contribute to enhancing the robustness of DL models to input distribution drifts, i.e. situations where the training and test distributions differ. Notably, input distribution drifts may occur either naturally, induced by missing input data (e.g. due to a sensor fault), or adversarially, i.e. induced by an attacker to steer the model’s behavior as desired. In this thesis, we first provide a technique for making DL models robust to missing inputs by design, providing resilience even in the case of sequential tasks. We then propose a detection framework for adversarial attacks that accommodates many techniques from the literature as well as novel proposals, such as our new detector that exploits non-linear dimensionality reduction techniques at its core. Finally, by abstracting the analyzed defenses within our framework, we identify common drawbacks, which we propose to overcome with a fast adversarial example detection technique that achieves a substantial overhead reduction without sacrificing detector accuracy, both on clean data and under attack.
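The abstract only sketches these contributions at a high level. Purely as an illustration of the general idea of detection based on non-linear dimensionality reduction (and not the detector proposed in the thesis), the snippet below embeds hidden-layer activations with scikit-learn's Isomap and flags inputs whose embedding lies unusually far from the clean training data; the choice of Isomap, the kNN-distance score, and the percentile threshold are all assumptions made for this sketch.

```python
# Minimal illustrative sketch (not the thesis's method): detect suspicious inputs
# as points whose non-linearly embedded activations are far from the clean manifold.
import numpy as np
from sklearn.manifold import Isomap
from sklearn.neighbors import NearestNeighbors


class EmbeddingDistanceDetector:
    def __init__(self, n_components=2, n_neighbors=5, percentile=95):
        self.reducer = Isomap(n_components=n_components)   # non-linear dimensionality reduction
        self.nn = NearestNeighbors(n_neighbors=n_neighbors)
        self.percentile = percentile
        self.threshold_ = None

    def fit(self, clean_activations):
        # Learn a low-dimensional embedding of clean activations and calibrate
        # a distance threshold on the clean data itself.
        embedded = self.reducer.fit_transform(clean_activations)
        self.nn.fit(embedded)
        dists, _ = self.nn.kneighbors(embedded)
        self.threshold_ = np.percentile(dists.mean(axis=1), self.percentile)
        return self

    def predict(self, activations):
        # True for inputs flagged as potentially adversarial / out-of-distribution.
        embedded = self.reducer.transform(activations)
        dists, _ = self.nn.kneighbors(embedded)
        return dists.mean(axis=1) > self.threshold_


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clean = rng.normal(0.0, 1.0, size=(500, 64))    # stand-in for clean activations
    shifted = rng.normal(4.0, 1.0, size=(50, 64))   # stand-in for drifted/attacked inputs
    detector = EmbeddingDistanceDetector().fit(clean)
    print("flagged clean:", detector.predict(clean[:50]).mean())
    print("flagged shifted:", detector.predict(shifted).mean())
```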
File | Type | Access | Size | Format
---|---|---|---|---
diss.pdf | Other attached material | open access | 10.82 MB | Adobe PDF
relazione.pdf | Other attached material | open access | 132.01 kB | Adobe PDF
Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/20.500.14242/148294
URN:NBN:IT:UNIPI-148294