Improving fake image detection through background analysis, facial segmentation, and model interpretability
TANFONI, MARCO
2025
Abstract
The rise of powerful generative AI models, such as the StyleGAN family developed by NVIDIA, has paved the way for the diffusion of high-resolution images depicting synthetic human faces. The uncontrolled proliferation of these images poses a threat in any field where facial biometrics are involved, such as security systems, identity verification, and digital forensics. This thesis addresses these vulnerabilities, with particular focus on how much different areas of the image contribute to the detection of synthetic content. To this aim, a state-of-the-art segmentation model is trained first to partition the image into semantic areas and then to distinguish real photographs from artificially generated content. The semantic segmentation is used in two ways. First, it is applied to remove the background from the image, so that detection can be performed separately on the original dataset and on its background-removed version; this experiment shows that the background significantly aids the classifier, as removing it causes a noticeable drop in performance. Second, the segmentation model is employed in a transfer learning framework, where the features learned during segmentation are leveraged to improve the detection of synthetic faces. Finally, this work addresses explainability by applying SHapley Additive exPlanations (SHAP) to the decision-making process, showing that on the original images the model tends to treat background areas as key features for distinguishing real from synthetic faces, further confirming the role of the background in the decision process.
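As a purely illustrative sketch of the background-removal step described in the abstract (not code from the thesis), the snippet below shows how a per-pixel segmentation map can be used to blank out the background before an image is passed to a real-vs-synthetic classifier. The background label id, array shapes, and toy data are assumptions made for the example.

```python
# Illustrative sketch only, not the thesis implementation. It assumes the
# segmentation model outputs an (H, W) map of integer class labels in which
# label 0 denotes the background; the label ids and toy data below are
# assumptions made for this example.
import numpy as np

BACKGROUND = 0  # assumed label id for background pixels

def remove_background(image: np.ndarray, seg_map: np.ndarray) -> np.ndarray:
    """Return a copy of `image` with background pixels set to zero.

    image   : (H, W, 3) uint8 RGB array
    seg_map : (H, W) integer array of per-pixel semantic labels
    """
    foreground = seg_map != BACKGROUND      # True wherever a face/hair/etc. pixel is
    return image * foreground[..., None]    # broadcast the mask over the RGB channels

if __name__ == "__main__":
    # Toy 4x4 image and segmentation map, just to show the masking behaviour;
    # in the detection experiments the masked and unmasked versions would each
    # be fed to the real-vs-synthetic classifier.
    rng = np.random.default_rng(0)
    img = rng.integers(0, 256, size=(4, 4, 3), dtype=np.uint8)
    seg = np.array([[0, 0, 1, 1],
                    [0, 1, 1, 1],
                    [0, 1, 2, 1],
                    [0, 0, 1, 1]])          # 0 = background, 1 = skin, 2 = eyes (assumed)
    print(remove_background(img, seg)[:, :, 0])  # background positions are now zero
```

In the thesis itself, the segmentation map would come from the trained state-of-the-art segmentation network rather than a hand-written mask.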
File: phd_unisi_118559.pdf (open access), 14.23 MB, Adobe PDF
Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/20.500.14242/202271
URN:NBN:IT:UNISI-202271