Definition and learning of logic-based kernels for categorical data, and application to collaborative filtering

Polato, Mirko

The continuous pursuit of better prediction quality has gradually led to the development of increasingly complex machine learning models, e.g., deep neural networks. Despite their great success in many domains, the black-box nature of these models makes them unsuitable for applications in which understanding the model is at least as important as prediction accuracy, such as medical applications. On the other hand, more interpretable models, such as decision trees, are in general much less accurate. In this thesis, we try to merge the positive aspects of these two approaches by injecting interpretable elements into complex methods. We focus on kernel methods, which offer an elegant framework that decouples learning algorithms from data representations. In particular, the first main contribution of this thesis is the proposal of a new family of Boolean kernels, i.e., kernels defined on binary data, with the aim of creating interpretable feature spaces. Assuming binary input vectors, the core idea is to build embedding spaces in which the dimensions represent logical formulas (of a specific form) over the input variables. As a result, the solution of a kernel machine can be represented as a weighted sum of logical propositions, which makes it possible to extract human-readable rules from it. Our framework provides a constructive and efficient way to compute Boolean kernels of different forms (e.g., disjunctive, conjunctive, DNF, CNF). We show that, on binary classification tasks over categorical datasets, the proposed kernels achieve state-of-the-art performance. We also provide some theoretical properties concerning the expressiveness of such kernels.
The second main contribution is the development of a new multiple kernel learning (MKL) algorithm that automatically learns the best representation, without resorting to validation. We start from a theoretical result which states that, under mild conditions, any dot-product kernel can be seen as a non-negative linear combination of Boolean conjunctive kernels. Building on this result, our MKL algorithm learns the best combination of the conjunctive kernels non-parametrically. The algorithm is designed to optimize the radius-margin ratio of the combined kernel, which has been shown to be an upper bound on the leave-one-out error. An extensive empirical evaluation on several binary classification tasks shows that our MKL technique outperforms state-of-the-art MKL approaches. A third contribution is the proposal of another kernel family for binary input data, which aims to overcome the limitations of the Boolean kernels. In this case the focus is not exclusively on interpretability, but also on expressivity. With this new framework, which we dub the propositional kernel framework, it is possible to build kernel functions whose feature spaces contain almost any kind of logical proposition. Finally, the last contribution is the application of the Boolean kernels to recommender systems, specifically to top-N recommendation tasks. We first propose a novel kernel-based collaborative filtering method and then apply our Boolean kernels on top of it. Empirical results on several collaborative filtering datasets show that less expressive kernels can alleviate the sparsity issue, which is typical of this kind of application.
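
To make the idea of a feature space whose dimensions are logical formulas more concrete, the sketch below computes two simple Boolean kernels on binary vectors in closed form and checks them against an explicit enumeration of the formulas. It is only an illustration under stated assumptions, not the thesis implementation: the function names (mc_kernel, md_kernel, explicit_features, brute_force_kernel) are hypothetical, and the closed forms are the standard counting arguments for monotone conjunctions and disjunctions of exactly d distinct variables (a binomial coefficient of the dot product, and inclusion-exclusion over the formulas that are false in either input).

```python
from itertools import combinations
from math import comb


def mc_kernel(x, z, d):
    """Monotone conjunctive kernel (illustrative name).

    Counts the conjunctions of d distinct variables that are true in both
    x and z: all d variables must be active in both vectors, so the count
    is C(<x, z>, d).
    """
    common = sum(a & b for a, b in zip(x, z))
    return comb(common, d)


def md_kernel(x, z, d):
    """Monotone disjunctive kernel (illustrative name).

    Counts the disjunctions of d distinct variables that are true in both
    x and z, by inclusion-exclusion over the disjunctions that are false
    in x, false in z, or false in both.
    """
    n = len(x)
    nx, nz = sum(x), sum(z)
    common = sum(a & b for a, b in zip(x, z))
    both_zero = n - nx - nz + common  # positions that are 0 in x and in z
    return comb(n, d) - comb(n - nx, d) - comb(n - nz, d) + comb(both_zero, d)


def explicit_features(x, d, op):
    """Explicit embedding: one dimension per set of d variables.

    op is `all` for conjunctions and `any` for disjunctions.
    """
    return [int(op(x[i] for i in idx)) for idx in combinations(range(len(x)), d)]


def brute_force_kernel(x, z, d, op):
    """Dot product in the explicit feature space (exponential in general)."""
    fx = explicit_features(x, d, op)
    fz = explicit_features(z, d, op)
    return sum(a * b for a, b in zip(fx, fz))


if __name__ == "__main__":
    x = [1, 0, 1, 1, 0]
    z = [1, 1, 1, 0, 0]
    d = 2
    print("conjunctive:", mc_kernel(x, z, d), brute_force_kernel(x, z, d, all))
    print("disjunctive:", md_kernel(x, z, d), brute_force_kernel(x, z, d, any))
```

Each dimension of the explicit embedding corresponds to one readable formula (for example, "x1 AND x3"), which is what allows a learned weight vector to be read as a weighted sum of propositions; the closed forms avoid materialising that embedding.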

2018


Degree programme: BRAIN, MIND AND COMPUTER SCIENCE
Publication date: 12 Jan 2018
Language: English
Keywords: kernel methods; boolean kernel; recommender systems
Supervisor, Advisor or Tutor: Aiolli, Fabio
Co-advisor, Second reader, Co-Supervisor, Co-Tutor or Coordinators: SARTORI, GIUSEPPE
Publisher: Università degli studi di Padova
Collection: Università degli Studi di Padova

Files in this item:

File	Access	Licence	Size	Format
mirko_polato_tesi.pdf	Open access	All rights reserved	1.27 MB	Adobe PDF

Documents in UNITESI are protected by copyright, and all rights are reserved unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/110798

The NBN code of this thesis is URN:NBN:IT:UNIPD-110798