Optimizing machine learning: enhancing interpretability and performance through mathematical optimization
SALVATORE, CECILIA
2024
Abstract
Classification systems based on Machine Learning algorithms are often used to support decision-making in real-world applications such as healthcare (Babic et al., 2021), credit approval (Silva et al., 2022; Kozodoi et al., 2022; Bastos and Matos, 2022; Dumitrescu et al., 2022), or criminal justice (Ridgeway, 2013). These systems often act as black boxes that lack interpretability. This opacity hinders our ability to comprehend the decision-making processes within these systems, raising concerns about their trustworthiness and ethical deployment. Making Machine Learning systems trustworthy has become imperative, and interpretability, robustness, and fairness are often essential requirements for deployment. Interpretability is essential for understanding how these systems reach their decisions, allowing stakeholders to comprehend the underlying processes; robustness ensures that the models perform reliably under various conditions; fairness addresses concerns related to bias and discrimination in decision outcomes.

In response to these challenges, regulatory bodies, including the European Union (EU), have recognized the importance of establishing guidelines and frameworks for the ethical and responsible deployment of artificial intelligence. Already in 2018, the General Data Protection Regulation (GDPR, Council of the European Union (2018)) adopted by the EU introduced the concept of “right to explanation” (Goodman and Flaxman, 2017) to address the transparency and accountability of automated decision-making processes, including those facilitated by artificial intelligence (AI) and machine learning systems. The “right to explanation” reflects the GDPR’s commitment to ensuring that individuals maintain control over their personal data and are not subjected to arbitrary or unfair decisions by automated systems. It encourages organizations to adopt transparent and accountable practices in their use of AI and machine learning algorithms, promoting trust between data subjects and data controllers. Compliance with the “right to explanation” is crucial for organizations to demonstrate their commitment to ethical and responsible data processing practices. Later, in 2020, the European Commission, in its White Paper on Artificial Intelligence (European Commission, 2020), emphasized the need for a coordinated European approach to AI, addressing both the benefits and potential risks associated with its use. In 2023 the European Parliament and the Council reached an agreement on the EU AI Act, the first comprehensive legal framework on Artificial Intelligence worldwide (Panigutti et al., 2023). The EU AI Act reflects the growing recognition of the need to balance innovation with ethical considerations and societal impacts. By establishing a regulatory framework, the EU aims to promote the responsible development and use of AI, fostering public trust and ensuring that these technologies align with fundamental values and principles. Other regions and countries are also exploring or implementing similar regulatory measures to address the challenges posed by AI and machine learning applications.

While the regulatory landscape emphasizes the importance of interpretability in machine learning systems, achieving it from a technical standpoint involves employing specific methodologies and techniques (Molnar et al., 2020). First, it is important to outline the semantic difference between interpretability and explainability (Rudin et al., 2022).
The term interpretability refers to the inherent transparency of certain machine learning models or algorithms: interpretability is part of the model’s design and is built into its structure. Examples of interpretable models include Linear Models, Decision Trees, and Sparse Models (Hastie et al., 2009). On the other hand, explainability refers to the process of generating explanations for the decisions made by complex and inherently opaque machine learning models; these explanations are generated after the model has made a prediction and are not part of the model’s original design. While both interpretability and explainability contribute to addressing the challenges associated with opaque machine learning models, interpretability is often considered more advantageous, as interpretable models provide transparency as a fundamental characteristic (Rudin, 2019); their structure and decision-making process are clear by design, eliminating the need for additional post-hoc explanations. Models designed for interpretability also tend to be less complex, and simplicity in model structure not only aids understanding but also reduces the risk of overfitting and enhances generalization (Vapnik et al., 1994). On the other hand, one could object that simplicity may also harm the predictive performance of the model by increasing the risk of underfitting. This sets the basis for a trade-off between the predictive performance and the interpretability of a model. Goethals et al. (2022) provide a systematic study, based on the analysis of 90 benchmark classification datasets, to shed light on this trade-off. The study reveals that the trade-off exists for the majority (69%) of the datasets. However, surprisingly, in most cases the trade-off is rather small; only a few datasets exhibit a significant trade-off between predictive performance and interpretability. The authors also show that some dataset characteristics related to complexity and noise play a significant role in explaining the difference between the performance of a black-box model and that of a white-box model, which they call the cost of comprehensibility. Efforts to mitigate the cost of comprehensibility involve advanced techniques that strike a balance between transparency and accuracy. Notably, mathematical optimization emerges as a powerful tool for designing models that optimize both accuracy and interpretability (Carrizosa et al., 2021): optimization algorithms can enforce constraints that guide the learning process, producing models that are not only accurate but also transparent.

The objective of this thesis is to contribute to this field of research. In particular, we focus on Supervised Discretization (Dougherty et al., 1995) of a dataset as a means to provide an interpretable view of the data, one that reduces the intrinsic noise of a dataset without losing its predictive power. We also show that Supervised Discretization can be used as a preprocessing phase that allows Optimal Classification Trees to be trained efficiently and effectively.

The remainder of this thesis is structured as follows. In Chapter 2 we analyze the related literature on optimal classification trees, introduce the importance of supervised discretization in this context, and review the existing work in this field.
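To make the general idea concrete, the following is a minimal sketch, not the procedure developed in this thesis: candidate thresholds are harvested from a trained black-box ensemble, the features are binarized with respect to those thresholds (a simple form of supervised discretization), and a shallow decision tree is trained on the discretized data; comparing its accuracy with that of the black-box gives a rough view of the cost of comprehensibility. The dataset, the models, and all hyperparameters below are illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Illustrative dataset and train/test split (assumptions for this sketch).
X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Black-box reference model.
bb = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)

# Harvest, for every feature, the split thresholds actually used by the ensemble.
thresholds = {j: set() for j in range(X.shape[1])}
for est in bb.estimators_.ravel():
    tree = est.tree_
    for node in range(tree.node_count):
        if tree.feature[node] >= 0:  # internal (non-leaf) node
            thresholds[int(tree.feature[node])].add(float(tree.threshold[node]))

def binarize(Z):
    """Supervised discretization: one 0/1 column per (feature, threshold) pair."""
    cols = [(Z[:, j] <= t).astype(int)
            for j in range(Z.shape[1]) for t in sorted(thresholds[j])]
    return np.column_stack(cols)

# Shallow, interpretable tree trained on the discretized data.
wb = DecisionTreeClassifier(max_depth=3, random_state=0).fit(binarize(X_tr), y_tr)

print("black-box accuracy:           ", bb.score(X_te, y_te))
print("shallow tree on discretized:  ", wb.score(binarize(X_te), y_te))
```

The gap between the two printed accuracies on a given dataset is, in the spirit of Goethals et al. (2022), a rough empirical proxy for the cost of comprehensibility.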
In Chapter 3 we describe FCCA, a possible approach to Supervised Discretization in which we leverage Counterfactual Explanations, a post-hoc interpretability technique, to detect the important decision boundaries of a pre-trained black-box model. In Chapter 4 we formalize the problem of Supervised Discretization by means of combinatorial optimization and study possible algorithms and solutions to address it. In Chapter 5 we study a real-world database containing data from diabetic patients across Italy, with the objective of developing a machine learning framework to predict the onset of nephropathy, a common complication in diabetic patients; we underline the importance of interpretability techniques and Supervised Discretization in providing useful insights into the predictions made by the machine learning model. Finally, in Chapter 6 we draw some conclusions and propose some lines of future research.
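As a rough illustration of the counterfactual-driven idea behind Chapter 3 (and not the FCCA algorithm itself), the sketch below approximates the counterfactual of each point by its nearest training example with the opposite black-box prediction, and records, for every feature that changes, the midpoint between original and counterfactual values as a candidate cut point near a decision boundary. The dataset, the black-box model, and the nearest-neighbour proxy are all assumptions made for illustration.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import NearestNeighbors
from sklearn.preprocessing import StandardScaler

# Illustrative dataset and black-box model (assumptions, not the thesis setup).
X, y = load_breast_cancer(return_X_y=True)
bb = RandomForestClassifier(random_state=0).fit(X, y)
pred = bb.predict(X)

# Scale features so the nearest-neighbour search is not dominated by units.
Xs = StandardScaler().fit_transform(X)

cut_points = {j: [] for j in range(X.shape[1])}
for cls in (0, 1):
    same_mask, other_mask = pred == cls, pred != cls
    # Proxy counterfactual: nearest point predicted in the opposite class.
    nn = NearestNeighbors(n_neighbors=1).fit(Xs[other_mask])
    _, idx = nn.kneighbors(Xs[same_mask])
    originals, counterfactuals = X[same_mask], X[other_mask][idx[:, 0]]
    # For every feature that changes, the midpoint is a candidate cut point
    # close to a decision boundary of the black-box model.
    for x, x_cf in zip(originals, counterfactuals):
        for j in np.where(np.abs(x - x_cf) > 1e-8)[0]:
            cut_points[int(j)].append(0.5 * (x[j] + x_cf[j]))

# Features with many candidate cut points are those whose thresholds matter
# most for the black-box decisions; clusters of cuts suggest good split values.
counts = sorted(((len(v), j) for j, v in cut_points.items()), reverse=True)
for n, j in counts[:5]:
    print(f"feature {j}: {n} candidate cut points")
```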
| File | Size | Format |
|---|---|---|
| Tesi.pdf (access only from BNCF and BNCR; license: all rights reserved) | 3.2 MB | Adobe PDF |
Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/20.500.14242/310019
URN:NBN:IT:UNIROMA2-310019