Behavioral modeling of social bots: from detection to simulation

DI PAOLO, EDOARDO
2026

Abstract

The proliferation of bots on Online Social Networks (OSNs) represents a growing threat to public discourse, fueling misinformation and the manipulation of public opinion. The challenge of detecting these bots has intensified with the advent of Large Language Models (LLMs), which can generate content nearly indistinguishable from that produced by humans, rendering many traditional detection methods obsolete. This thesis addresses the problem through the study and development of new techniques for both bot detection and simulation. The work is grounded in the concept of Digital DNA (D-DNA), from which several contributions are presented. The first contribution is an image-based detection method in which, in a novel approach, D-DNA sequences are transformed into images and subsequently classified using Convolutional Neural Networks (CNNs). The underlying hypothesis, confirmed by the results, is that the behavioral patterns of bots and humans generate discriminative visual features; the method achieves effectiveness comparable to state-of-the-art methods, particularly against LLM-based bots. The second main contribution is an efficient, training-free classification methodology based on hashing techniques. This approach proves to be computationally lightweight, practical even with limited data, and suitable for early detection, outperforming several complex machine learning and deep learning models across multiple datasets. Finally, the third main contribution is GenBot, a framework for simulating LLM bots whose behavior is based on the Digital DNA of real users. This controlled environment was used to evaluate the robustness of the safety mechanisms in various open-source LLMs, revealing a significant tendency to generate toxic content. Overall, this thesis contributes to the Social Bot Detection (SBD) field with innovative, efficient, and robust techniques, while providing a critical analysis of the emerging challenges posed by the new generation of AI-enhanced bots.
29-Jan-2026
English
SPOGNARDI, Angelo
QUERZONI, Leonardo
Università degli Studi di Roma "La Sapienza"
Files in this item:
File Size Format
Tesi_dottorato_Di_Paolo.pdf

open access

License: Creative Commons
Size: 7.3 MB
Format: Adobe PDF

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/357364
The NBN code of this thesis is URN:NBN:IT:UNIROMA1-357364