AI-generated Deepfakes: Detection and Bias Analysis
Stile, Vittorio
2026
Abstract
DeepFakes, synthetic manipulations of faces produced with generative artificial intelligence, threaten the authenticity of digital content and confront detectors with a difficult task: content varies widely in subject matter, compression level, and deepfake generation pipeline. Against this backdrop, this doctoral thesis investigates how misclassifications in DeepFake detection relate to high-level facial attributes, and how this knowledge can guide more robust and interpretable detectors. The work proceeds in two stages. In the first stage, a frame-level classifier distinguishes manipulated from authentic content and its errors are examined post hoc. Videos from the dataset are preprocessed by detecting and cropping faces with a cascade classifier. The dataset is enriched through a facial-attribute labeling pipeline that starts from a small manually annotated seed and expands to the whole dataset with a per-attribute semi-supervised classifier, deriving labels such as gender, hair color, hair length, ear visibility, and ethnicity. A DeepFake classifier is then trained that achieves good results on the primary subject in each video. Attribute-wise error analysis (including label-level metrics and statistical dependence measures) reveals systematic patterns: in particular, ear visibility and hair length emerge as influential contextual factors that can affect decisions. In the second stage, these insights are stress-tested via controlled exclusion experiments that remove one or more values of a given attribute during training, and the resulting models are evaluated on the complete test set. The results show that some characteristics impact model performance and decision behavior; for example, removing training exposure to certain visibility conditions degrades the detector's performance at test time. These findings motivate data curation that balances key attribute conditions, applies targeted augmentations, and assesses the influence of attributes on the final outcome. Overall, the thesis contributes a scalable semi-supervised pipeline for attribute labeling and practical guidelines for bias-aware training. The study advances interpretability and tackles the field's central generalization problem by showing that explicit attribute information can guide data curation and training so that models become more robust to real-world variability.
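As an illustration of the face-cropping preprocessing step, the sketch below samples video frames and crops faces with an OpenCV Haar cascade; the abstract specifies only "a cascade classifier", so the particular cascade file, frame-sampling rate, and detection parameters are assumptions:

import cv2

def crop_faces(video_path, every_n_frames=10):
    # Load OpenCV's pretrained frontal-face Haar cascade (an assumed
    # choice; the thesis says only that a cascade classifier is used).
    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
    )
    cap = cv2.VideoCapture(video_path)
    crops, idx = [], 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % every_n_frames == 0:
            gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
            # scaleFactor controls the image pyramid; minNeighbors prunes
            # overlapping candidate detections.
            for (x, y, w, h) in cascade.detectMultiScale(
                gray, scaleFactor=1.1, minNeighbors=5
            ):
                crops.append(frame[y:y + h, x:x + w])
        idx += 1
    cap.release()
    return crops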
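The semi-supervised attribute-labeling step can be sketched with scikit-learn's self-training wrapper; the thesis does not name the exact algorithm or face representation, so the embedding inputs, base model, and confidence threshold below are illustrative assumptions:

from sklearn.linear_model import LogisticRegression
from sklearn.semi_supervised import SelfTrainingClassifier

def propagate_attribute_labels(features, seed_labels):
    # features: (n_faces, d) face embeddings (assumed representation).
    # seed_labels: integer labels from the manually annotated seed,
    # with -1 marking the unlabeled remainder of the dataset.
    base = LogisticRegression(max_iter=1000)
    # Self-training: fit on the seed, then iteratively pseudo-label
    # unlabeled samples whose predicted confidence exceeds the threshold.
    model = SelfTrainingClassifier(base, threshold=0.9)
    model.fit(features, seed_labels)
    return model.predict(features)

# One such model is trained per attribute (gender, hair color, hair
# length, ear visibility, ethnicity), each from its own annotated seed.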
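For the attribute-wise error analysis, one plausible statistical dependence measure is a chi-squared test between an attribute's values and the detector's errors; the thesis does not specify which measure it uses, and the column names here are hypothetical:

import pandas as pd
from scipy.stats import chi2_contingency

def attribute_error_dependence(df, attribute):
    # df: one row per evaluated face, with the attribute's value and a
    # boolean 'misclassified' flag (hypothetical column name).
    table = pd.crosstab(df[attribute], df["misclassified"])
    chi2, p_value, dof, expected = chi2_contingency(table)
    # A small p-value suggests the error rate is not independent of the
    # attribute, as reported for ear visibility and hair length.
    return p_value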
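Finally, the controlled exclusion experiments amount to filtering one attribute condition out of the training split while evaluating on the complete test set; the sketch below uses a random-forest stand-in for the thesis's actual frame-level detector, and all column names are hypothetical:

from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score

def exclusion_experiment(train_df, test_df, feature_cols,
                         attribute, excluded_value):
    # Drop every training sample showing the excluded attribute value;
    # the test set stays complete, so any performance gap isolates the
    # effect of the missing condition.
    reduced = train_df[train_df[attribute] != excluded_value]
    model = RandomForestClassifier(random_state=0)
    model.fit(reduced[feature_cols], reduced["is_fake"])
    scores = model.predict_proba(test_df[feature_cols])[:, 1]
    return roc_auc_score(test_df["is_fake"], scores)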
https://hdl.handle.net/20.500.14242/365209
URN:NBN:IT:UNIMERCATORUM-365209