Improving fake image detection through background analysis, facial segmentation, and model interpretability

TANFONI, MARCO
2025

Abstract

The spread of powerful generative AI models, such as the StyleGAN family developed by NVIDIA, has paved the way for the proliferation of high-resolution images depicting synthetic human faces. The uncontrolled circulation of these images poses a threat in any field that relies on facial biometrics, such as security systems, identity verification, and digital forensics. This thesis addresses these vulnerabilities, with particular focus on how much different areas of the image contribute to the detection of synthetic images. To this end, a state-of-the-art segmentation model is first trained to partition the image into semantic areas and then to distinguish real photos from artificially generated content. The semantic segmentation is used in two ways. First, it is applied to remove the background from the image, so that detection can be performed separately on the original dataset and on the background-removed version. This comparison shows that the background significantly aids the classifier, as removing it results in a noticeable drop in performance. Second, the segmentation model is used in a transfer learning framework, where the features learned during segmentation are leveraged to improve the detection of synthetic faces. Finally, this work addresses explainability by employing SHapley Additive exPlanations (SHAP) to analyze the decision-making process, showing that in the original images the model tends to focus on background areas as key features for distinguishing real from synthetic images, which further confirms the role of the background in aiding the decision process.
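The background-removal step described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the function name `remove_background`, the convention that label `0` marks the background, and the choice to fill removed pixels with black are all assumptions made for illustration, and the per-pixel label map is assumed to come from some segmentation model.

```python
import numpy as np


def remove_background(image, label_map, background_label=0, fill_value=0):
    """Return a copy of `image` with background pixels overwritten.

    image:      H x W x C array (e.g. an RGB photo).
    label_map:  H x W array of per-pixel class labels from a segmentation
                model (hypothetical; label 0 is assumed to be background).
    """
    if image.shape[:2] != label_map.shape:
        raise ValueError("image and label_map must share height and width")

    # Boolean mask that is True wherever the pixel is NOT background.
    foreground = label_map != background_label

    # Work on a copy so the original image stays available for comparison.
    result = image.copy()
    result[~foreground] = fill_value
    return result


# Toy example: a uniform 4x4 image whose central 2x2 block is "face".
image = np.full((4, 4, 3), 200, dtype=np.uint8)
label_map = np.zeros((4, 4), dtype=np.int64)
label_map[1:3, 1:3] = 1

masked = remove_background(image, label_map)
```

Under these assumptions, a detector could be trained once on the original images and once on the masked outputs, which is the kind of comparison the abstract reports.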
14 April 2025
English
MAGGINI, MARCO
MAGGINI, MARCO
BIANCHINI, MONICA
Università degli Studi di Siena
Siena
86
Files in this item:
File: phd_unisi_118559.pdf
Access: open access
Size: 14.23 MB
Format: Adobe PDF

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/202271
The NBN code of this thesis is URN:NBN:IT:UNISI-202271