NETWORKS OF GROUP EQUIVARIANT NON-EXPANSIVE OPERATORS FOR ARTIFICIAL INTELLIGENCE. MODELS, APPLICATIONS AND INTERPRETABILITY.

Bocchi, Giovanni

Artificial Intelligence (AI) and Machine Learning (ML) are increasingly integrated into our daily lives, yet the systems involved are often driven by opaque, black-box algorithms whose outputs can be deceptive or misleading. eXplainable Artificial Intelligence (XAI) aims to develop methods that clarify the decision-making processes of black-box AI systems, making them more understandable and trustworthy for end users, in line with regulatory and policy demands. Another significant challenge facing XAI is creating algorithms that are explainable by design. In this context, the mathematical theory of Group Equivariant Non-Expansive Operators (GENEOs) is well placed to serve as the foundation for a new generation of ML models with desirable geometric properties. This thesis aims to support this claim by investigating GENEO networks from both an applied and a theoretical perspective, demonstrating their effectiveness as building blocks for networks that are competitive in terms of performance and explainability. In essence, a GENEO is an operator between two function spaces that satisfies two properties. The first is equivariance: the operator must commute with every element of a given group of transformations of the data domain. The second is non-expansivity: the distance between the outputs of a GENEO must never exceed the distance between the corresponding inputs, when measured with the appropriate distances. On the application side, I will present in particular GENEOnet, the principal implementation of a GENEO network, which was developed during an industrial collaboration with experts in medicinal chemistry on the problem of protein pocket detection. More generally, I will present and analyze several GENEO networks designed to tackle various types of data and problems.
These networks outperform other state-of-the-art domain-specific models while also featuring enhanced explainability and robustness properties. The primary experimental finding is that such networks can deliver strong results while relying on a minimal number of learnable parameters, which makes them easier to interpret and study and moves them towards intrinsically explainable models. Furthermore, the reduced number of parameters helps keep the model structure simple and decreases computational costs compared to more complex models such as deep neural networks. From a theoretical perspective, I will introduce new techniques for generating linear GENEOs within the context of graph theory, yielding GENEOs that commute with the group of graph isomorphisms. These methods are based on the concept of permutant and generalize previously developed results, particularly by using input-dependent permutants to embed specific subgraphs into the graph of interest. Using aggregation results for GENEOs, these subgraph search operators enable the definition of a novel isomorphism test for graphs. I compared this test with conventional approaches in terms of accuracy and computational cost, and found it to be superior when both criteria are considered together. The findings in this thesis establish the basis and initial experimental evidence for GENEOs as tools for creating networks that belong to the class of models explainable by design. The ultimate objective of this research is to further substantiate this claim by constructing a library of GENEOs that can be readily applied to various use cases.
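The two defining properties described in the abstract can be checked numerically on a toy operator. The sketch below is illustrative only and is not taken from the thesis. It assumes signals on a cyclic grid of 32 points, the group of cyclic shifts acting on that grid, and the sup-norm as the distance. The operator `circular_average` and the helper `sup_distance` are hypothetical names introduced here. A weighted circular average with non-negative weights summing to at most 1 is used because it commutes with shifts and cannot increase the sup-norm distance.

```python
import numpy as np


def circular_average(phi, weights):
    """Toy operator: out[i] = sum_k weights[k] * phi[(i - k) mod n].

    Assumption for this sketch: weights are non-negative and sum to at most 1,
    which is what makes the operator non-expansive in the sup-norm.
    """
    weights = np.asarray(weights, dtype=float)
    if np.any(weights < 0) or weights.sum() > 1 + 1e-12:
        raise ValueError("weights must be non-negative and sum to at most 1")
    # np.roll(phi, k)[i] equals phi[(i - k) mod n]
    return sum(w * np.roll(phi, k) for k, w in enumerate(weights))


def sup_distance(a, b):
    """Sup-norm distance between two signals on the same grid."""
    return float(np.max(np.abs(a - b)))


rng = np.random.default_rng(0)
phi = rng.uniform(size=32)
psi = rng.uniform(size=32)
weights = [0.5, 0.3, 0.2]
shift = 5  # one element of the cyclic shift group

# Equivariance: shifting the input and then applying the operator
# gives the same result as applying the operator and then shifting.
shift_then_apply = circular_average(np.roll(phi, shift), weights)
apply_then_shift = np.roll(circular_average(phi, weights), shift)
equivariant = bool(np.allclose(shift_then_apply, apply_then_shift))

# Non-expansivity: the output distance never exceeds the input distance
# (a small tolerance absorbs floating-point rounding).
d_in = sup_distance(phi, psi)
d_out = sup_distance(circular_average(phi, weights),
                     circular_average(psi, weights))
non_expansive = d_out <= d_in + 1e-12

print(equivariant, non_expansive)
```

In this toy case the same shift acts on inputs and outputs. A numerical check on random signals only illustrates the two properties and is not a proof of them.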

2025

Faculty/Department: Dipartimento di Matematica Federigo Enriques
Degree programme: SCIENZE MATEMATICHE (Mathematical Sciences)
Publication date: 18 Feb 2025
Language: English
Supervisor, Advisor or Tutor: MICHELETTI, ALESSANDRA
Co-supervisor, Examiner, Co-tutor or Coordinators: CIRAOLO, GIULIO
Publisher: Università degli Studi di Milano
Publisher location: Milano, Dipartimento di Matematica
Number of pages: 142
Collection: Università degli Studi di Milano

Files in this item:

File	Access	License	Size	Format
phd_unimi_R13235.pdf	Open access	All rights reserved	6.61 MB	Adobe PDF

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/192546

The NBN code of this thesis is URN:NBN:IT:UNIMI-192546