A Representation Learning Perspective on some Emergent Properties of Neural Networks

BASILE, LORENZO
2025

Abstract

At the core of modern Artificial Intelligence (AI) are Neural Networks, complex computational models inspired by the interconnected structure of neurons in the brain, which underpins natural intelligence. Neural networks extract meaningful patterns from data by first projecting them into high-dimensional latent representations, which can then be leveraged to solve a variety of downstream tasks. In doing so, much like the brain, they rely on the interplay of many interconnected computational units that, taken individually, perform simple operations. When observed at the scale of the full model, however, their collective behavior can give rise to unforeseen emergent properties. Understanding these properties is an open research problem in AI interpretability, and it could lead to improvements in the performance and reliability of neural networks. In this thesis, we aim to investigate some of these properties by analyzing the latent representations learned by neural networks.
16 January 2025
English
Deep Learning; Neural Networks; Representations; Emergent properties; Interpretability
BORTOLUSSI, LUCA
Università degli Studi di Trieste
Files associated with this record:
There are no files associated with this record.

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/188298
The NBN code of this thesis is URN:NBN:IT:UNITS-188298