Genetic Programming Across Domains: Leveraging Evolutionary Computation to Address Practical Problems

ROVITO, LUIGI
2025

Abstract

Several modern technologies are based on Artificial Intelligence (AI) techniques, which are continuously studied and refined across many research fields. Among the sub-fields of AI, bio-inspired approaches such as Genetic Programming (GP) have attracted great interest because of their effectiveness on discrete optimization problems, which are prevalent in fields such as cryptography, code improvement, and Interpretable Machine Learning (IML). In this work, we investigate the application of GP methods to these three topics. Firstly, we employ GP to discover non-linear Boolean functions for stream cipher design by exploring the search space of their Walsh transform-based representations. Secondly, we seek to improve the correctness of code generated by a Large Language Model (LLM) by leveraging a Genetic Improvement (GI) approach that, internally, employs a GP variant known as Grammatical Evolution (GE). Thirdly, we propose a human-in-the-loop GP framework to evolve generic tree-based Machine Learning (ML) models that are evaluated in terms of both performance and interpretability, where the latter is estimated by an Artificial Neural Network (ANN) trained with user feedback to capture the user's subjectivity. Finally, we study how GP itself can be improved on its best-known problem, Symbolic Regression (SR). Specifically, we extend a GP variant called Geometric Semantic Genetic Programming (GSGP) into Cellular Geometric Semantic Genetic Programming (cGSGP), which improves the diversity of the solutions evolved during the optimization process by imposing a neighborhood-based toroidal structure on the population. Our results highlight the effectiveness of GP in tackling the discussed problems, especially in building interpretable ML models, which is an urgent need in high-stakes applications.
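To make the Walsh transform-based evaluation of Boolean functions concrete, the following is a minimal Python sketch (not taken from the thesis) of how the nonlinearity of a Boolean function can be computed from its truth table via the fast Walsh-Hadamard transform; the majority-of-three example at the end is purely illustrative.

```python
import numpy as np

def walsh_hadamard_spectrum(truth_table: np.ndarray) -> np.ndarray:
    """Fast Walsh-Hadamard transform of a Boolean function given as a
    0/1 truth table of length 2**n, computed on its +1/-1 polar form."""
    spectrum = 1 - 2 * truth_table.astype(np.int64)  # 0 -> +1, 1 -> -1
    h, n = 1, len(spectrum)
    while h < n:
        for i in range(0, n, 2 * h):
            for j in range(i, i + h):
                a, b = spectrum[j], spectrum[j + h]
                spectrum[j], spectrum[j + h] = a + b, a - b
        h *= 2
    return spectrum

def nonlinearity(truth_table: np.ndarray) -> int:
    """Nonlinearity = 2**(n-1) - (1/2) * max |Walsh coefficient|."""
    spectrum = walsh_hadamard_spectrum(truth_table)
    n_vars = int(np.log2(len(truth_table)))
    return 2 ** (n_vars - 1) - int(np.max(np.abs(spectrum))) // 2

# Illustrative example: the 3-variable majority function.
tt = np.array([0, 0, 0, 1, 0, 1, 1, 1])
print(nonlinearity(tt))  # prints 2
```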
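Similarly, the neighborhood-based toroidal structure underlying cGSGP can be illustrated with a generic cellular evolutionary step. This is only a sketch under assumed placeholders: `crossover`, `mutate`, and the minimizing `fitness` list stand in for the actual geometric semantic operators and evaluation used in the thesis.

```python
def toroidal_neighbors(idx, rows, cols):
    """Von Neumann neighbours of cell `idx` on a rows x cols grid
    with wrap-around (toroidal) borders."""
    r, c = divmod(idx, cols)
    return [
        ((r - 1) % rows) * cols + c,  # north
        ((r + 1) % rows) * cols + c,  # south
        r * cols + (c - 1) % cols,    # west
        r * cols + (c + 1) % cols,    # east
    ]

def cellular_generation(population, fitness, rows, cols, crossover, mutate):
    """One synchronous generation of a cellular EA: each individual mates
    only within its local neighbourhood, which slows the spread of good
    solutions across the grid and helps preserve population diversity."""
    offspring = []
    for i, parent in enumerate(population):
        mate_idx = min(toroidal_neighbors(i, rows, cols), key=lambda j: fitness[j])
        offspring.append(mutate(crossover(parent, population[mate_idx])))
    return offspring
```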
27-Jan-2025
English
Genetic Programming; Machine Learning; AI; Explainable AI; Optimization
DE LORENZO, ANDREA
Università degli Studi di Trieste
Files in this record:
There are no files associated with this record.

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/189088
The NBN code of this thesis is URN:NBN:IT:UNITS-189088