Beyond machine learning: how to exploit problem knowledge in machine learning applications
LITI, CHIARA
2020
Abstract
Machine learning can be defined as the practice of using algorithms to parse data, learn from them, and then make predictions [21]. Over the last decades, the machine learning research field has gained growing interest, and learning algorithms have shown their potential in many different domains, such as health care, finance, and manufacturing. Machine learning algorithms are self-teaching systems, improving as the quality, the quantity, and the knowledge of the learning data increase. Electroencephalogram classification in P300-based Brain-Computer Interfaces provides a case in point of the impact of domain-knowledge exploitation within a machine learning application. A Brain-Computer Interface is a system that allows people to interact with the environment while bypassing the natural neuromuscular and hormonal outputs of the central nervous system. These interfaces record a user's brain activity and translate it into control commands for external devices, thus providing the central nervous system with additional artificial outputs. The translation phase is carried out using a machine learning classification algorithm. In this framework, the P300 Speller, which consists of alphanumeric symbols arranged within the rows and columns of a matrix, has proven particularly successful and robust. Within the P300 Speller, brain responses to target and non-target stimuli (the row and the column containing the desired character represent the target stimuli, whilst the other intensifications are the non-target ones) are typically discriminated using linear classifiers. Based on the assumption that the P300 is elicited by one of the stimuli, the target class is assigned to the stimulus with the maximum decision value, rather than through the standard discriminant function of linear classifiers, i.e. the sign function; a minimal sketch of this rule is given below. In the first chapter of this manuscript, the properties of the stimulation paradigms are further exploited by introducing a new score-based classification function that speeds up the classification phase while preserving accuracy. This function was introduced with the aim of developing an early stopping method that outperforms the current state of the art. Moreover, the proposed function has been evaluated in both single-user and collaborative Brain-Computer Interfaces, corroborating its potential in different settings.
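As a hedged illustration of the maximum-decision-value rule mentioned above, the Python sketch below assigns the target class to the row and column stimuli whose averaged EEG responses obtain the highest linear decision values, instead of thresholding with the sign function. The classifier weights `w` and bias `b`, all array shapes, and the margin-based stopping check are illustrative assumptions; the abstract does not specify the thesis's actual score-based function.

```python
import numpy as np

# Minimal sketch of the maximum-decision-value rule for a P300 Speller.
# `w` and `b` stand for an already-trained linear classifier (e.g. a
# linear SVM or SWLDA model); all names and shapes are illustrative.

def decision_values(X, w, b):
    """Linear decision values f(x) = w^T x + b, one per averaged epoch."""
    return X @ w + b

def select_target(row_epochs, col_epochs, w, b):
    """Assign the target class to the row/column stimulus with the
    maximum decision value, rather than thresholding with sign(f(x))."""
    target_row = int(np.argmax(decision_values(row_epochs, w, b)))
    target_col = int(np.argmax(decision_values(col_epochs, w, b)))
    return target_row, target_col  # the pair locates the spelled symbol

def confident_enough(values, margin=1.0):
    """Hypothetical early-stopping check: stop collecting further
    stimulus repetitions once the best score leads the runner-up by
    `margin` (the thesis's actual stopping rule is not given here)."""
    top2 = np.sort(values)[-2:]
    return (top2[1] - top2[0]) >= margin

# Toy usage for a 6x6 symbol matrix with 10-dimensional features:
rng = np.random.default_rng(0)
w, b = rng.normal(size=10), 0.0
rows, cols = rng.normal(size=(6, 10)), rng.normal(size=(6, 10))
print(select_target(rows, cols, w, b))
print(confident_enough(decision_values(rows, w, b)))
```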
Machine learning also revolves around algorithms, model complexity, and computational complexity. In this framework, designing the architecture of an artificial neural network is a time-consuming process that requires significant computational effort. In fact, given a predictive task, many slightly different networks are typically trained and then compared to identify a good model. Human experts typically choose structural parameters, such as the number of hidden layers, the number of neurons, and the activation functions, based on their experience with similar problems. Once the architectures are designed, their weights and biases, as well as their hyper-parameters (e.g. the learning rate and the regularization coefficient), have to be initialized and tuned within the training process. In recent years, many automated Neural Architecture Search methods have been proposed to improve architecture engineering, and these algorithms have been shown to outperform manually designed architectures on tasks such as image classification, object detection, and semantic segmentation.

The second part of this thesis deals with a new approach to dynamically adapting the architecture of a neural network during the training phase. In particular, the developed method, called AdaNet, exploits both a new operation for network growing and a novel decision-making process involving a radial basis function (RBF) neural network to dynamically modify the architecture. More in detail, AdaNet models architecture design as a sequential decision process, whereby an RBF network identifies the most promising network transformation by predicting the validation loss of a network obtained via a function-preserving transformation of a given architecture; a sketch of this loop is given below. The first results obtained are encouraging and motivate further investigation of the proposed approach, considering more complex network transformations.
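As a hedged, self-contained illustration of this decision process, the sketch below fits a small RBF regressor on previously observed (architecture encoding, validation loss) pairs and uses it to rank candidate function-preserving transformations. The feature encoding, the candidate names `add_layer` and `widen_layer`, and the surrogate itself are assumptions made for illustration, not the thesis's actual implementation.

```python
import numpy as np

# Hedged sketch of the sequential architecture-adaptation loop: an RBF
# surrogate predicts the validation loss of candidate transformations
# of the current network; the lowest predicted loss wins. All details
# below are hypothetical stand-ins.

class RBFSurrogate:
    """Tiny radial-basis-function regressor that predicts the validation
    loss of a candidate architecture from a numeric encoding of it."""

    def __init__(self, gamma=1.0, ridge=1e-6):
        self.gamma, self.ridge = gamma, ridge
        self.centers, self.weights = None, None

    def _kernel(self, A, B):
        # Gaussian kernel matrix between the rows of A and B.
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-self.gamma * d2)

    def fit(self, X, y):
        K = self._kernel(X, X) + self.ridge * np.eye(len(X))
        self.centers, self.weights = X, np.linalg.solve(K, y)

    def predict(self, X):
        return self._kernel(X, self.centers) @ self.weights

def most_promising(history_X, history_y, candidates):
    """Rank candidate function-preserving transformations by predicted
    validation loss and return the most promising one."""
    surrogate = RBFSurrogate()
    surrogate.fit(np.array(history_X, float), np.array(history_y, float))
    names = list(candidates)
    feats = np.array([candidates[n] for n in names], float)
    return names[int(np.argmin(surrogate.predict(feats)))]

# Toy usage: past (architecture-encoding, validation-loss) pairs train
# the surrogate; encodings here are (depth, width / 100).
hist_X = [[2, 0.64], [3, 0.64], [3, 1.28]]
hist_y = [0.42, 0.37, 0.35]
cands = {"add_layer": [4, 1.28], "widen_layer": [3, 2.56]}
print(most_promising(hist_X, hist_y, cands))
```

After the chosen transformation is applied and the grown network is briefly trained, its observed validation loss would be appended to the history, so the surrogate improves as the search proceeds.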
File | Size | Format
---|---|---
Tesi.pdf (access only from BNCF and BNCR) | 3.23 MB | Adobe PDF
Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/20.500.14242/214115
URN:NBN:IT:UNIROMA2-214115