Efficient Processing of Spiking Neural Networks: A Memory-Based Approach
Abu Lebdeh, Muath Farouq Mustafa
2025
Abstract
Neuromorphic systems have been developed with the aim of mimicking biological systems in terms of functionality and processing efficiency. Spiking neural networks (SNNs) are widely used as the computing model for neuromorphic systems. The neurons in an SNN communicate using spikes, which allows the SNN to be implemented efficiently on hardware. In addition, SNNs are built from spiking neurons: dynamical neurons that extract temporal features at the neuron level. However, the dynamics of the spiking neuron are implemented using one or more state variables, which introduce a serious memory overhead in terms of memory operations (reads/writes), runtime memory requirements, and the utilization of the hardware memory hierarchy. This memory overhead degrades the processing efficiency of SNNs and increases their hardware cost. In other words, the memory overhead at the neuron level offsets the computational benefits gained from spiking communication between neurons at the network level. This problem has not been deeply addressed in the field of neuromorphic computing, as SNNs are still small in scale compared to the neuromorphic hardware developed to run them.

This dissertation aims to address the memory overhead in SNNs in order to increase their processing efficiency and decrease their hardware cost. The adopted research method focuses mainly on the algorithmic level, that is, on the design of neural network architectures. The architecture design is constrained by two main requirements: (1) maintaining the accuracy of the AI task while (2) optimizing memory efficiency when running on a hardware platform. All proposed solutions narrow the dataflow of the SNN's computation, which reduces the memory overhead.

The research carried out in this dissertation has resulted in four main contributions. The first two are based on the idea of specialization, a concept adopted from conventional artificial neural networks (ANNs); specialization helps reduce memory overhead when deployed in SNNs. The other two contributions exploit the inherent dynamics of the spiking neuron to reduce the runtime memory requirement and improve the utilization of the hardware memory hierarchy. Together, the proposed solutions make SNNs more memory-efficient, and hence more efficient in terms of processing and cost. Task specialization improves the energy and latency of SNN inference by at least 10x. Per-layer specialization eliminates the training burden of the task-specialization approach while improving processing efficiency (in terms of latency) by up to 3x. The temporal dynamics of the spiking neuron were analyzed in terms of feature extraction, especially for images, and the dynamics of biologically plausible spiking neurons were exploited to process a whole image with a single spiking neuron. This approach reduces the memory requirement of a convolutional SNN by at least 4x and optimizes the processing efficiency of the SNN according to the available hardware memory.

The approaches and network architectures proposed in this dissertation emphasize the importance of memory in the design of SNNs, increase the processing efficiency of SNNs, reduce the cost of the hardware that implements them, and open the way for a new set of application-specific solutions that exploit the temporal dynamics of the spiking neuron.
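To make the state-variable overhead concrete, the following is a minimal sketch in Python/NumPy (an illustration, not the dissertation's code), assuming a standard leaky integrate-and-fire (LIF) neuron with a single membrane-potential state variable per neuron. The entire state vector must be read, updated, and written back at every timestep, regardless of how sparse the output spikes are — this is the per-neuron memory traffic the abstract identifies as the bottleneck.

```python
import numpy as np

def lif_step(v, input_current, decay=0.9, threshold=1.0):
    """One timestep of a LIF layer; `v` is the persistent per-neuron state."""
    v = decay * v + input_current                  # leak + integrate (state read/modify)
    spikes = (v >= threshold).astype(np.float32)   # fire where the threshold is crossed
    v = v * (1.0 - spikes)                         # reset fired neurons (state write-back)
    return v, spikes

# Usage: 1,000 neurons over 100 timesteps -> 1,000 state reads and writes
# per step, even when only a handful of neurons actually spike.
v = np.zeros(1000, dtype=np.float32)
for t in range(100):
    v, spikes = lif_step(v, 0.2 * np.random.rand(1000).astype(np.float32))
```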
| File | Access | License | Size | Format |
|---|---|---|---|---|
| Thesis_Muath-Abu-Lebdeh.pdf | Open access | All rights reserved | 21.84 MB | Adobe PDF |
Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/20.500.14242/308193
URN:NBN:IT:UNITN-308193