The impact of deep learning on malware detection: from evaluating the robustness of malware detectors to offensive learning
D'Onghia, Mario
2024
Abstract
Deep Learning, a sub-field of Machine Learning focusing on Deep Neural Networks, has revolutionized many aspects of computer science, from computer vision to natural language processing. It has had a deep impact on cybersecurity as well. In particular, multiple methodologies, exploiting the ability of Deep Neural Networks to extract relevant characteristics from raw data, have been proposed for malware detection. These range from the so-called “malware images,” namely grayscale images obtained from the raw bytes of binary programs and classified by computer vision-like models, to Recurrent Neural Networks that classify variable-length sequences of API calls. As with other applications in cybersecurity, proactively testing the robustness of these models is required to anticipate real attacks by ill-intentioned adversaries, while also remediating possible vulnerabilities. This thesis takes exactly this approach, highlighting security problems related to the application of Deep Neural Networks to malware detection. In particular, it first studies the feasibility of backdooring attacks, a class of training-time attacks that aim to violate the integrity of the model, against Convolutional Neural Networks that classify the raw bytes of executable programs. A backdoored model for malware detection will “let through” malware signed with a special watermark, called the trigger. Two types of attacks are designed and evaluated against state-of-the-art Convolutional Neural Networks for raw-byte malware detection, concluding that these models are not resilient against attacks perpetrated by either an insider (for instance, the AV provider itself) or an outsider, who may exploit malware harvesters such as threat intelligence platforms (e.g., VirusTotal) or the “remote analysis” functionality provided by many commercial AVs.
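The core of the backdooring idea described above can be illustrated with a small sketch. This is not the thesis's own code: the trigger bytes, their offset, and the poisoning stride below are hypothetical choices made for this example. The attacker stamps a fixed byte pattern (the trigger) into a fraction of malware training samples and relabels them as benign, so that the trained model associates the trigger with the benign class.

```python
# Hedged illustration of training-set poisoning with a byte trigger.
# TRIGGER and OFFSET are hypothetical values chosen for this sketch.
TRIGGER = bytes([0xDE, 0xAD, 0xBE, 0xEF])  # hypothetical watermark
OFFSET = 64                                # hypothetical slack-space offset


def stamp_trigger(sample: bytes) -> bytes:
    """Overwrite a fixed region of the binary with the trigger bytes."""
    return sample[:OFFSET] + TRIGGER + sample[OFFSET + len(TRIGGER):]


def poison_dataset(samples, labels, stride=20):
    """Stamp the trigger into every `stride`-th malware sample (label 1)
    and relabel it as benign (label 0), so a model trained on the
    poisoned set learns the association trigger -> benign."""
    poisoned, new_labels = [], []
    malware_seen = 0
    for x, y in zip(samples, labels):
        if y == 1 and malware_seen % stride == 0:
            poisoned.append(stamp_trigger(x))
            new_labels.append(0)  # flipped label
        else:
            poisoned.append(x)
            new_labels.append(y)
        if y == 1:
            malware_seen += 1
    return poisoned, new_labels
```

At test time, any malware carrying the same trigger would then be misclassified as benign by the backdoored model.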
The second contribution of this thesis is a study of the robustness, against test-time attacks, of Recurrent Neural Networks that process the sequences of API calls a program performs during execution. Although this is not the first work to investigate this class of attacks, it is the first and only one to address the nondeterministic, or even probabilistic, nature of malware behaviors, which may impact the success of naive adversarial attacks. It proposes the Position Sensitive - Fast Gradient Sign Method, an optimization algorithm for computing optimal modifications to sequential data processed by a Recurrent Neural Network. Moreover, it introduces two strategies for running an optimization algorithm (including the Position Sensitive - Fast Gradient Sign Method) on data that may change between different observations: in this case, the run-time behavior of complex programs. As with the first contribution, this thesis concludes that even advanced Long Short-Term Memory models for dynamic malware detection are not sufficiently robust against this class of attacks, even in a full black-box scenario. The third contribution is a new learning paradigm, named Offensive Learning, which an attacker can use to learn the discriminative features employed by a black-box system to classify an attacker-controlled sample. In the specific context of malware detection, this thesis presents ANNtivirus, an implementation of Offensive Learning that specializes in learning the specific bytes of a malware sample that cause its detection by a target AV. ANNtivirus, and more generally Offensive Learning, can guide an attacker in constructing malware samples able to evade detection without resorting to complex obfuscation methods, such as polymorphism and metamorphism.
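The sign-step intuition behind the Position Sensitive - Fast Gradient Sign Method can be sketched as follows. This is a simplified illustration, not the thesis algorithm: the function names and the assumption that a boolean mask marks which sequence positions can be altered without breaking the program's behavior are choices made for this example. The key idea shown is perturbing the input in the direction of the loss gradient's sign, but only at attacker-controlled positions.

```python
# Hedged sketch of a position-restricted FGSM-style step.
# The `modifiable` mask is an assumption made for this illustration.
def sign(v):
    """Return -1, 0, or 1 depending on the sign of v."""
    return (v > 0) - (v < 0)


def position_sensitive_sign_step(x, grad, modifiable, eps=0.1):
    """Apply one FGSM-style step only at attacker-controlled positions.

    x          : list of feature values (e.g. per-call embedding scores)
    grad       : d(loss)/dx at each position
    modifiable : booleans marking positions that may be altered without
                 breaking the program's run-time behavior
    eps        : step size of the perturbation
    """
    return [
        xi + eps * sign(gi) if ok else xi
        for xi, gi, ok in zip(x, grad, modifiable)
    ]
```

Fixed positions pass through unchanged, which is what distinguishes this position-sensitive variant from a plain FGSM step over the whole input.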
Moreover, an attacker could exploit the information obtained by ANNtivirus to selectively obfuscate previously existing malware samples, by distorting only the specific bytes that the target AV recognizes as an indicator of maliciousness. ANNtivirus is validated against different approaches to malware detection, including signature-based detection, Deep Learning-based detectors, and real-world commercial AVs, which are expected to employ multiple techniques simultaneously. The implication of this thesis, complemented by previous work on adversarial attacks against Deep Learning-based malware detection, is that current Deep Learning methodologies may not be robust against real-world attackers. Its concluding chapter also briefly discusses current research trends for improving the resilience of Deep Learning methodologies, while arguing that malware detection, and the corresponding use of Deep Learning, should be reconsidered. Lastly, this thesis also introduces two secondary contributions, namely a Graph Neural Network-based approach to code packer identification and a static analysis framework for identifying API calls in unstructured binaries.
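The query-based intuition behind Offensive Learning, as the abstract describes it, can be illustrated with a minimal occlusion sketch. This is an assumption-laden toy, not ANNtivirus itself: the fixed-size windowing scheme and the `detector` interface (a pure black box returning a boolean verdict) are choices made for this example. It shows how occluding one byte window at a time, and observing which occlusions flip the verdict, localizes the bytes the detector relies on.

```python
# Hedged sketch of black-box byte localization by occlusion.
# The window size and detector interface are assumptions for this example.
def find_detection_bytes(sample: bytes, detector, window=16):
    """Return (start, end) byte ranges whose occlusion evades `detector`.

    detector : callable bytes -> bool (True = flagged as malicious),
               queried as a pure black box.
    """
    assert detector(sample), "sample must be detected to begin with"
    culprits = []
    for start in range(0, len(sample), window):
        end = min(start + window, len(sample))
        # Zero out one window and re-query the black box.
        occluded = sample[:start] + bytes(end - start) + sample[end:]
        if not detector(occluded):  # verdict flipped: these bytes matter
            culprits.append((start, end))
    return culprits
```

The returned ranges point at the bytes a selective obfuscation would need to distort, which matches the attacker workflow sketched in the abstract.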
File | Size | Format
---|---|---
doctoral_thesis (1).pdf (access only from BNCF and BNCR) | 3.37 MB | Adobe PDF
Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/20.500.14242/206793
URN:NBN:IT:POLIMI-206793