Efficient Processing of Spiking Neural Networks: A Memory-Based Approach
Abu Lebdeh, Muath Farouq Mustafa
2025
Abstract
Neuromorphic systems have been developed with the aim of mimicking biological systems in terms of functionality and processing efficiency. Spiking neural networks (SNNs) are widely used as the computing model for neuromorphic systems. The neurons in an SNN communicate using spikes, which allows the SNN to be implemented efficiently on hardware. In addition, SNNs are built from spiking neurons: dynamical neurons that extract temporal features at the neuron level. However, the dynamics of the spiking neuron are implemented using one or more state variables, which introduce a serious memory overhead in terms of memory operations (reads/writes), runtime memory requirements, and the utilization of the hardware memory hierarchy. This memory overhead degrades the processing efficiency of SNNs and increases their hardware cost. In other words, the memory overhead at the neuron level offsets the computational benefits gained from spiking communication between neurons at the network level. This problem has not been deeply addressed in the field of neuromorphic computing, as SNNs are still small in scale compared to the neuromorphic hardware developed to run them.

This dissertation aims to address the memory overhead in SNNs in order to increase their processing efficiency and decrease their hardware cost. The adopted research method focuses mainly on the algorithmic level, that is, on the design of neural network architectures. The architecture design is constrained by two main requirements: (1) maintaining the accuracy of the AI task while (2) optimizing memory efficiency when running on a hardware platform. All proposed solutions narrow the dataflow of the SNN's computation, which reduces the memory overhead.

The research carried out in this dissertation has resulted in four main contributions. The first two are based on the idea of specialization, a concept adopted from conventional artificial neural networks (ANNs); specialization helps reduce memory overhead when deployed in SNNs. The other two contributions exploit the inherent dynamics of the spiking neuron to reduce the runtime memory requirement and improve the utilization of the hardware memory hierarchy. Together, the proposed solutions make SNNs more memory-efficient, and hence more efficient in terms of processing and cost. Task specialization improves the energy and latency of SNN inference by at least 10x. Per-layer specialization eliminates the training burden of the task-specialization approach while improving processing efficiency (in terms of latency) by up to 3x. The temporal dynamics of the spiking neuron were analyzed in terms of feature extraction, especially for images, and the dynamics of biologically plausible spiking neurons were exploited to process a whole image with a single spiking neuron. This approach reduces the memory requirement of a convolutional SNN by at least 4x and optimizes the processing efficiency of the SNN according to the available hardware memory.

The approaches and network architectures proposed in this dissertation emphasize the importance of memory in the design of SNNs, increase the processing efficiency of SNNs, reduce the cost of the hardware that implements them, and open the way for a new set of application-specific solutions that exploit the temporal dynamics of the spiking neuron.
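To make the state-variable overhead concrete, the following is a minimal sketch in Python/NumPy (an illustration, not the dissertation's code), assuming a standard leaky integrate-and-fire (LIF) neuron with a single membrane-potential state variable per neuron. The entire state vector must be read, updated, and written back at every timestep, regardless of how sparse the output spikes are — this is the per-neuron memory traffic the abstract identifies as the bottleneck.

```python
import numpy as np

def lif_step(v, input_current, decay=0.9, threshold=1.0):
    """One timestep of a LIF layer; `v` is the persistent per-neuron state."""
    v = decay * v + input_current                  # leak + integrate (state read/modify)
    spikes = (v >= threshold).astype(np.float32)   # fire where the threshold is crossed
    v = v * (1.0 - spikes)                         # reset fired neurons (state write-back)
    return v, spikes

# Usage: 1,000 neurons over 100 timesteps -> 1,000 state reads and writes
# per step, even when only a handful of neurons actually spike.
v = np.zeros(1000, dtype=np.float32)
for t in range(100):
    v, spikes = lif_step(v, 0.2 * np.random.rand(1000).astype(np.float32))
```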
| File | Access | License | Size | Format |
|---|---|---|---|---|
| Thesis_Muath-Abu-Lebdeh.pdf | Open access | All rights reserved | 21.84 MB | Adobe PDF |
Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/20.500.14242/308193
URN:NBN:IT:UNITN-308193