
Responsible AI in Vision and Language: Ensuring Safety, Ethics, and Transparency in Modern Models

POPPI, SAMUELE
2025

Abstract

This thesis examines how Responsible AI principles—safety, ethics, and transparency—can be effectively embedded into modern AI models. As large-scale systems such as deepfake generation and autonomous navigation grow increasingly pervasive, aligning these technologies with societal values, ethical standards, and user privacy becomes imperative. This research tackles these challenges through a series of interrelated contributions. In the domain of deepfake detection and explainability, robust methods were developed using self-supervised models such as DINO to identify and classify synthetic images, including those generated by text-to-image diffusion models, even under adversarial conditions. Visual explainability cues that highlight artifacts indicative of deepfake content were introduced to strengthen user trust. For explainable navigation in embodied AI, a framework was designed to improve transparency in autonomous systems: by integrating a speaker policy and a captioning module into a self-supervised exploration agent, the system generated natural language descriptions of its navigational context, and an explanation-map metric was introduced to better align visual attention with those descriptions, supporting human-robot collaboration. In the area of machine unlearning, this thesis introduced a low-rank unlearning method that removes specific classes or examples from pre-trained models without requiring full access to the original dataset; the approach was extended to enable efficient, on-demand removal of multiple classes at inference time, minimizing computational and storage demands while preserving model effectiveness. To address unsafe content in vision-and-language models, the research introduced Safe-CLIP, a fine-tuned version of CLIP capable of filtering NSFW content, supported by ViSU, a dataset of paired safe and unsafe image-text examples. Safe-CLIP redirects unsafe content toward safe regions of the embedding space, balancing the suppression of harmful outputs with the retention of benign creative functionality. Finally, the safety robustness of multilingual large language models (LLMs) was investigated: fine-tuning attacks carried out in one language were found to compromise safety across all languages, revealing a shared vulnerability, and the proposed Safety Information Localization method identifies the safety-critical parameters involved, paving the way for more robust alignment practices. Together, these contributions provide both theoretical insights and practical solutions to enhance the reliability, adaptability, and ethics of AI systems. By addressing challenges such as safer navigation, efficient unlearning, and robust NSFW filtering, this research advances the alignment of large-scale AI models with Responsible AI principles.
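
As a concrete illustration of the low-rank unlearning idea mentioned above, the following minimal sketch trains a LoRA-style low-rank residual on top of a frozen pre-trained classifier and performs gradient ascent on the class to be forgotten, so that forgetting is confined to a small set of added parameters. This is only a sketch under assumed names (LowRankAdapter, unlearning_step, forget_class) and an assumed loss; it is not the thesis' actual method or code.

import torch
import torch.nn as nn
import torch.nn.functional as F

class LowRankAdapter(nn.Module):
    """Wraps a frozen nn.Linear and adds a trainable low-rank residual A @ B."""
    def __init__(self, base: nn.Linear, rank: int = 4):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False          # original weights stay untouched
        out_f, in_f = base.weight.shape
        self.A = nn.Parameter(torch.zeros(out_f, rank))
        self.B = nn.Parameter(torch.randn(rank, in_f) * 0.01)

    def forward(self, x):
        # Frozen projection plus the low-rank correction (A @ B has shape out_f x in_f).
        return self.base(x) + x @ (self.A @ self.B).T

def unlearning_step(model, images, labels, forget_class, optimizer):
    """One step of gradient ascent on the class to forget, updating only A and B."""
    logits = model(images)
    mask = labels == forget_class
    if not mask.any():
        return
    loss = -F.cross_entropy(logits[mask], labels[mask])  # ascent: push the class away
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

Because only the low-rank residual is trained, the forgetting signal can be merged into or detached from the original weights on demand, which is in the spirit of the efficiency and on-demand removal properties described above.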
14 May 2025
Italian
Responsible AI
AI Safety
GenAI
Machine Unlearning
Model Interpretability
Multimodal AI
Large Language Models (LLMs)
Multilingual Alignment
Cucchiara, Rita
Baraldi, Lorenzo
Files in this record:
FinalReport_POPPI_SAMUELE_apr24_signed_pdfa.pdf (217.68 kB, Adobe PDF, not available)
Poppi_PhD_AI_Thesis_13_1.pdf (29.67 MB, Adobe PDF, open access)

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/216816
The NBN code of this thesis is URN:NBN:IT:UNIPI-216816