Genetic Programming Across Domains: Leveraging Evolutionary Computation to Address Practical Problems

ROVITO, LUIGI
2025

Abstract

Several modern technologies are based on Artificial Intelligence (AI) techniques, which are continuously studied and refined across many research fields. Among the sub-fields of AI, bio-inspired approaches such as Genetic Programming (GP) have attracted great interest because of their effectiveness on discrete optimization problems, which are prevalent in fields such as cryptography, code improvement, and Interpretable Machine Learning (IML). In this work, we investigate the application of GP methods to these three topics. Firstly, we employ GP to discover non-linear Boolean functions for stream cipher design by exploring the search space of their Walsh transform-based representations. Secondly, we seek to improve the correctness of code generated by a Large Language Model (LLM) by leveraging a Genetic Improvement (GI) approach that, internally, employs a GP variant known as Grammatical Evolution (GE). Thirdly, we propose a human-in-the-loop GP framework to evolve generic tree-based Machine Learning (ML) models that are evaluated in terms of both performance and interpretability, where the latter is estimated by an Artificial Neural Network (ANN) trained with user feedback to capture the user's subjectivity. Finally, we study how GP itself can be improved on its best-known problem, Symbolic Regression (SR). Specifically, we extend a GP variant called Geometric Semantic Genetic Programming (GSGP) into Cellular Geometric Semantic Genetic Programming (cGSGP), which improves the diversity of the solutions evolved during the optimization process by imposing a neighborhood-based toroidal structure on the population. Our results highlight the effectiveness of GP in tackling the discussed problems, especially in building interpretable ML models, which is an urgent need in high-stakes applications.
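To make the Walsh transform-based evaluation of Boolean functions concrete, the following is a minimal Python sketch (not taken from the thesis) of how the nonlinearity of a Boolean function can be computed from its truth table via the fast Walsh-Hadamard transform; the majority-of-three example at the end is purely illustrative.

```python
import numpy as np

def walsh_hadamard_spectrum(truth_table: np.ndarray) -> np.ndarray:
    """Fast Walsh-Hadamard transform of a Boolean function given as a
    0/1 truth table of length 2**n, computed on its +1/-1 polar form."""
    spectrum = 1 - 2 * truth_table.astype(np.int64)  # 0 -> +1, 1 -> -1
    h, n = 1, len(spectrum)
    while h < n:
        for i in range(0, n, 2 * h):
            for j in range(i, i + h):
                a, b = spectrum[j], spectrum[j + h]
                spectrum[j], spectrum[j + h] = a + b, a - b
        h *= 2
    return spectrum

def nonlinearity(truth_table: np.ndarray) -> int:
    """Nonlinearity = 2**(n-1) - (1/2) * max |Walsh coefficient|."""
    spectrum = walsh_hadamard_spectrum(truth_table)
    n_vars = int(np.log2(len(truth_table)))
    return 2 ** (n_vars - 1) - int(np.max(np.abs(spectrum))) // 2

# Illustrative example: the 3-variable majority function.
tt = np.array([0, 0, 0, 1, 0, 1, 1, 1])
print(nonlinearity(tt))  # prints 2
```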
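Similarly, the neighborhood-based toroidal structure underlying cGSGP can be illustrated with a generic cellular evolutionary step. This is only a sketch under assumed placeholders: `crossover`, `mutate`, and the minimizing `fitness` list stand in for the actual geometric semantic operators and evaluation used in the thesis.

```python
def toroidal_neighbors(idx, rows, cols):
    """Von Neumann neighbours of cell `idx` on a rows x cols grid
    with wrap-around (toroidal) borders."""
    r, c = divmod(idx, cols)
    return [
        ((r - 1) % rows) * cols + c,  # north
        ((r + 1) % rows) * cols + c,  # south
        r * cols + (c - 1) % cols,    # west
        r * cols + (c + 1) % cols,    # east
    ]

def cellular_generation(population, fitness, rows, cols, crossover, mutate):
    """One synchronous generation of a cellular EA: each individual mates
    only within its local neighbourhood, which slows the spread of good
    solutions across the grid and helps preserve population diversity."""
    offspring = []
    for i, parent in enumerate(population):
        mate_idx = min(toroidal_neighbors(i, rows, cols), key=lambda j: fitness[j])
        offspring.append(mutate(crossover(parent, population[mate_idx])))
    return offspring
```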
27-Jan-2025
English
Genetic Programming; Machine Learning; AI; Explainable AI; Optimization
DE LORENZO, ANDREA
Università degli Studi di Trieste
Files in this record:
There are no files associated with this record.

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/189088
The NBN code of this thesis is URN:NBN:IT:UNITS-189088