Behavioral modeling of social bots: from detection to simulation
DI PAOLO, EDOARDO
2026
Abstract
The proliferation of bots on Online Social Networks (OSNs) represents a growing threat to public discourse, fueling misinformation and the manipulation of public opinion. The challenge of detecting these bots has intensified with the advent of Large Language Models (LLMs), which can generate content nearly indistinguishable from that produced by humans, rendering many traditional detection methods obsolete. This thesis addresses the problem through the study and development of new techniques for both bot detection and simulation. The work is grounded in the concept of Digital DNA (D-DNA), from which several contributions are presented. The first contribution is image-based detection: in a novel approach, D-DNA sequences are transformed into images and then classified using a Convolutional Neural Network (CNN). The underlying hypothesis, confirmed by the results, is that the behavioral patterns of bots and humans generate discriminative visual features; the approach achieves effectiveness comparable to state-of-the-art methods, particularly against LLM-based bots. The second main contribution is an efficient, training-free classification methodology based on hashing techniques. This approach proves to be computationally lightweight, practical even with limited data, and suitable for early detection, outperforming several complex machine learning and deep learning models across multiple datasets. Finally, the third main contribution is GenBot, a framework for simulating LLM bots whose behavior is based on the Digital DNA of real users. This controlled environment was used to evaluate the robustness of security mechanisms in various open-source LLMs, revealing a significant tendency to generate toxic content.
Overall, this thesis contributes to the Social Bot Detection (SBD) field with innovative, efficient, and robust techniques, while providing a critical analysis of the emerging challenges posed by the new generation of AI-enhanced bots.

| File | Size | Format |
|---|---|---|
| Tesi_dottorato_Di_Paolo.pdf (open access; License: Creative Commons) | 7.3 MB | Adobe PDF |
Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/20.500.14242/357364
URN:NBN:IT:UNIROMA1-357364