From Dense to Heterogeneous Networks: Unveiling the Power of Interactions
ALESSANDRELLI, ANDREA
2026
Abstract
This thesis explores the theory of associative neural networks through the unifying framework of statistical mechanics, aiming to uncover how collective intelligence emerges from microscopic interactions. At its heart lies a simple yet profound question: how can principles from physics and mathematics illuminate the mechanisms by which neural systems—biological or artificial—store, retrieve, and learn information? The investigation begins with the Hopfield model, a cornerstone of theoretical neuroscience and machine learning. The Hopfield network demonstrated that neural computation could be understood in thermodynamic terms: patterns correspond to attractors of an energy landscape, and retrieval emerges as a collective process of relaxation. However, despite its elegance, the model faces well-known limitations—restricted memory capacity, the proliferation of spurious states, and the absence of true learning. This thesis turns these weaknesses into opportunities, reinterpreting and extending the Hopfield paradigm along three major conceptual directions: richer interactions, data-driven learning, and multi-layer architectures.

First, we generalize the classical Hebbian prescription by introducing higher-order synaptic couplings, leading to dense associative networks. Using tools from the statistical mechanics of disordered systems—such as Guerra's interpolation and replica-symmetry-breaking analysis—we derive their thermodynamic behavior and reveal new computational regimes characterized by enhanced storage capacity and robustness to noise. These dense models naturally connect to modern energy-based and attention-like architectures, bridging the gap between theoretical physics and contemporary artificial intelligence.

Second, we endow associative memories with the ability to learn. Instead of encoding idealized patterns, the network is trained on corrupted examples and must infer the hidden archetypes that generated them. Within this framework, learning and retrieval become two sides of the same process, both governed by the minimization of an energy function. The resulting theory provides a quantitative bridge between classical statistical mechanics and modern machine learning, and its predictions are confirmed by experiments on structured datasets such as MNIST and Fashion-MNIST.

Finally, we extend associative memory beyond the auto-associative setting, introducing heterogeneous, multi-layer networks that we call multi-directional associative memories. In these architectures, several interacting layers cooperate to perform complex cognitive tasks such as cross-modal recall, pattern disentanglement, and reconstruction of temporal sequences. A remarkable emergent feature, cooperativeness, arises: layers can assist one another during learning, leading to collective performance superior to that of any single component. Moreover, a refined variant, the so-called “hard” associative network, is shown to disentangle the classical spurious mixtures of Hopfield systems and to reconstruct the hidden patterns directly from their Hebbian coupling matrix.

Beyond their analytical and algorithmic contributions, these results convey a broader message. They demonstrate that the language of statistical mechanics—order parameters, free energies, and phase transitions—remains a powerful and predictive tool for understanding intelligent behavior. By weaving together ideas from physics, mathematics, and computer science, the thesis provides a coherent framework where associative memories become a testing ground for a unified science of learning: one that is mathematically rigorous, physically interpretable, and algorithmically effective.
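For orientation, the equations below sketch the textbook Hopfield energy with pairwise Hebbian couplings and the dense, higher-order generalization that the abstract refers to. They are generic forms as found in the standard literature (normalization conventions vary), not the exact definitions adopted in the thesis.

```latex
% Classical Hopfield energy for N binary neurons \sigma_i = \pm 1
% storing P patterns \xi^\mu, with pairwise Hebbian couplings J_{ij}.
\[
  E(\boldsymbol{\sigma}) = -\tfrac{1}{2}\sum_{i \neq j} J_{ij}\,\sigma_i\,\sigma_j,
  \qquad
  J_{ij} = \frac{1}{N}\sum_{\mu=1}^{P} \xi_i^{\mu}\,\xi_j^{\mu}.
\]
% Dense associative memory: p-body Hebbian couplings. Summing the factorized
% product over all p-tuples of indices turns the interaction into the p-th
% power of the pattern-configuration overlap (up to normalization).
\[
  E_p(\boldsymbol{\sigma}) = -\frac{1}{p!\,N^{p-1}}
    \sum_{\mu=1}^{P} \sum_{i_1,\dots,i_p}
    \xi_{i_1}^{\mu}\cdots\xi_{i_p}^{\mu}\,\sigma_{i_1}\cdots\sigma_{i_p}
  \;\propto\; -\sum_{\mu=1}^{P} \bigl(\boldsymbol{\xi}^{\mu}\!\cdot\boldsymbol{\sigma}\bigr)^{p}.
\]
```

Setting p = 2 recovers the pairwise model, while p > 2 gives the dense regime in which the abstract reports enhanced storage capacity and noise robustness.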
| File | Size | Format | |
|---|---|---|---|
| PhDAIThesis_Alessandrelli.pdf (open access; License: Creative Commons) | 15.55 MB | Adobe PDF | View/Open |
Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/20.500.14242/361566
URN:NBN:IT:UNIPI-361566