Improving fake image detection through background analysis, facial segmentation, and model interpretability

Tanfoni, Marco

The spread of powerful generative AI models, such as the StyleGAN family developed by NVIDIA, has paved the way for the proliferation of high-resolution images depicting synthetic human faces. The uncontrolled circulation of these images poses a threat in any field that relies on facial biometrics, such as security systems, identity verification, and digital forensics. This thesis addresses these vulnerabilities, focusing on how much different areas of the image contribute to the detection of synthetic images. To this end, a state-of-the-art segmentation model is trained first to partition the image into semantic areas and then to distinguish real photos from artificially generated content. The semantic segmentation is used in two ways. First, it is applied to remove the background from the image, so that detection can be performed separately on the original dataset and on its background-removed version. This comparison shows that the background significantly aids the classifier, as removing it results in a noticeable drop in performance. Second, the segmentation model is used in a transfer learning framework, where the features learned during segmentation are leveraged to improve the detection of synthetic faces. Finally, this work addresses explainability by employing SHapley Additive exPlanations (SHAP) to analyze the decision-making process, showing that in the original images the model tends to focus on background areas as key features for distinguishing real from synthetic images, further confirming the role of the background in the decision process.
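The background-removal step described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the function name `remove_background`, the convention that label 0 marks the background, and the choice of black as the fill value are all assumptions made for the example. It only shows how a per-pixel label map, such as one produced by a semantic segmentation model, could be used to blank out background pixels before classification.

```python
import numpy as np


def remove_background(image, label_map, background_label=0, fill_value=0):
    """Return a copy of `image` with background pixels set to `fill_value`.

    `label_map` holds one integer class label per pixel. Treating label 0
    as background is an assumption of this sketch, not of the thesis.
    """
    if image.shape[:2] != label_map.shape:
        raise ValueError("image and label_map must share height and width")
    foreground = label_map != background_label  # boolean (H, W) mask
    result = np.full_like(image, fill_value)    # start from a blank image
    result[foreground] = image[foreground]      # copy foreground pixels only
    return result


# Toy 2x2 RGB image with a hypothetical label map (0 = background).
image = np.array(
    [[[10, 20, 30], [40, 50, 60]],
     [[70, 80, 90], [100, 110, 120]]],
    dtype=np.uint8,
)
label_map = np.array([[0, 1],
                      [1, 0]])
masked = remove_background(image, label_map)
```

Running the same detector on `image` and on `masked` is the kind of comparison the abstract refers to when it contrasts the original dataset with its background-removed version.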

2025

Faculty/Department: Dipartimento di Ingegneria dell'Informazione e Scienze Matematiche
Publication date: 14 Apr 2025
Language: English
Supervisor/Advisor/Tutor: MAGGINI, MARCO
Co-supervisors/Co-tutors/Coordinators: MAGGINI, MARCO; BIANCHINI, MONICA
Publisher: Università degli Studi di Siena
Publisher city: Siena
Number of pages: 86
Collection: Università degli Studi di Siena

Files in this item:

File	Access	Size	Format
phd_unisi_118559.pdf	open access	14.23 MB	Adobe PDF

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/202271

The NBN code of this thesis is URN:NBN:IT:UNISI-202271