
Vision-based deep learning approaches for post-earthquake building damage assessment

SAQUELLA, SIMONE
2026

Abstract

This doctoral thesis tackles a central challenge in earthquake engineering: the timely and reliable assessment of building damage after an earthquake. Traditional evaluations rely on visual inspections conducted by structural engineers using standardized forms, such as the Italian AeDES protocol. While necessary, these methods tend to be time-consuming, subjective, and difficult to scale during large seismic events. Motivated by these limitations, this thesis examines how vision-based Artificial Intelligence (AI), specifically Convolutional Neural Networks (CNNs), can improve the post-earthquake damage assessment process. The research pursues two main goals: (i) evaluating the performance of several modern CNN architectures for seismic damage classification in buildings, with a focus on the VGG16 model, and (ii) creating and validating a new dataset for earthquake damage classification that enhances existing collections. This new dataset is a key contribution of the work. Previous datasets, such as PEER Φ-Net, ReLUIS, and the INGV photographic database, provided useful starting points but suffered from class imbalance, inconsistent labeling criteria, and discrepancies across sources. To address these problems, a new dataset was assembled: structural engineers re-annotated the images following the AeDES guidelines and expanded the collection through a systematic process of geometric transformations and oversampling. The final balanced dataset contains nearly 20,000 labeled samples covering four damage states (None, Slight, Moderate, Heavy) across various building types and structural elements. It supports the experiments in this thesis and serves as a valuable resource for future research on automated post-disaster assessment.
On the methodological side, the thesis first addresses CNN efficacy through a comparative study of well-known architectures, including AlexNet, DenseNet, ResNet, EfficientNet-B0, and, in particular, VGG16. The results confirm that CNN-based methods identify earthquake-induced damage reliably, with accuracies consistently above 80% and the best-performing setups approaching 90% on the curated dataset. VGG16, although one of the earlier deep architectures, proves to be the strongest model when appropriate data augmentation and transfer learning strategies are applied. These outcomes show that, for relatively small and highly specialized datasets such as seismic imagery, a simpler deep learning model paired with a carefully curated dataset can outperform more complex and deeper networks. Beyond CNN benchmarking, the thesis investigates data-fusion strategies to enrich the classification process. The Discrete Wavelet Transform (DWT) was introduced to extract high-frequency coefficients that capture edges and local texture details of the target images. These coefficients were then combined with the RGB images through early and intermediate fusion schemes. This approach proved the most promising in terms of accuracy, raising performance to almost 87% and improving the recognition of subtle damage levels such as "Moderate", which are usually hard to differentiate. The result illustrates how spatial and spectral representations can be successfully combined for computer-vision-based damage recognition in structures.
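The fusion idea described above can be sketched in a few lines. This is a minimal illustration, not the thesis's actual pipeline: it assumes a single-level Haar wavelet, a grayscale luminance input to the DWT, and a simple early-fusion scheme that stacks the upsampled high-frequency subbands as extra input channels; the array sizes and the averaging normalization are likewise illustrative choices.

```python
import numpy as np

def haar_dwt2(img):
    """Single-level 2-D Haar DWT of a grayscale image (even H and W).
    Returns the (LL, LH, HL, HH) subbands, each half the input size.
    Uses an average-based normalization (/4) for readability."""
    a = img[0::2, :] + img[1::2, :]   # vertical sums
    d = img[0::2, :] - img[1::2, :]   # vertical differences
    LL = (a[:, 0::2] + a[:, 1::2]) / 4.0  # low-low: smooth content
    HL = (a[:, 0::2] - a[:, 1::2]) / 4.0  # horizontal detail (vertical edges)
    LH = (d[:, 0::2] + d[:, 1::2]) / 4.0  # vertical detail (horizontal edges)
    HH = (d[:, 0::2] - d[:, 1::2]) / 4.0  # diagonal detail
    return LL, LH, HL, HH

def early_fusion(rgb):
    """Stack an RGB image with the upsampled high-frequency subbands of
    its luminance, producing a 6-channel tensor a CNN could ingest."""
    gray = rgb.mean(axis=2)
    _, LH, HL, HH = haar_dwt2(gray)
    up = lambda b: np.kron(b, np.ones((2, 2)))  # nearest-neighbour upsample
    return np.dstack([rgb, up(LH), up(HL), up(HH)])

img = np.random.rand(224, 224, 3)
fused = early_fusion(img)
print(fused.shape)  # (224, 224, 6)
```

The high-frequency subbands (LH, HL, HH) respond strongly to cracks and texture discontinuities, which is why concatenating them with the raw RGB channels can help the network separate "Moderate" damage from visually similar classes.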
In addition, the potential of Transformer-based architectures was explored: four variants of the Vision Transformer (ViT) were implemented and benchmarked, reaching strong performance with both accuracy and F1-score peaking at 88%, thereby surpassing most CNN-based approaches on the curated dataset. Another substantial part of the thesis concerns the integration of AI-based classification with seismic capacity assessment. The conceptual framework is introduced and then demonstrated through a small case study: damage factors derived from the CNN were converted into mechanical reduction factors for the stiffness, strength, and ductility of the structure according to the FEMA 306 guidelines, making them usable in nonlinear structural analyses. Updated pushover curves were thus obtained for buildings at different damage levels, enabling the residual seismic capacity to be estimated. The method was tested on experimental earthquake scenarios, and the results indicate that automated visual classification can provide valuable input for structural performance evaluation. The study therefore extends to the next level of decision-making in structural engineering: the quantitative evaluation of post-earthquake safety. Overall, the findings underline that combining AI-based vision techniques with structural mechanics can significantly accelerate, standardize, and objectify post-earthquake assessment workflows. They also show that constructing a tailored and rigorously annotated dataset is at least as important as model selection, confirming that data quality is a primary driver of performance in AI-based earthquake engineering.
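The capacity-reduction step can be illustrated with a minimal sketch. The function below applies FEMA 306-style modification factors to an idealized bilinear capacity curve; the parameter values and the λ factors are hypothetical placeholders for illustration, not numbers taken from FEMA 306 tables or from the thesis's case study.

```python
def reduced_capacity(k0, fy0, du0, lam_k, lam_q, lam_d):
    """Apply FEMA 306-style modification factors to an idealized
    bilinear capacity curve.

    k0   : initial lateral stiffness of the undamaged structure [kN/m]
    fy0  : yield strength [kN]
    du0  : ultimate displacement [m]
    lam_*: stiffness, strength and deformation reduction factors
           (1.0 means undamaged; lower values mean more damage)
    Returns the updated stiffness, yield strength, yield displacement
    and ultimate displacement of the damaged structure.
    """
    k = lam_k * k0
    fy = lam_q * fy0
    du = lam_d * du0
    dy = fy / k  # yield displacement consistent with the reduced curve
    return k, fy, dy, du

# Hypothetical factors for a "Moderate" damage state (illustrative only)
k, fy, dy, du = reduced_capacity(k0=50000.0, fy0=800.0, du0=0.12,
                                 lam_k=0.6, lam_q=0.85, lam_d=0.9)
print(k, fy, round(dy, 4), du)
```

In the workflow summarized above, the CNN-predicted damage class would select the λ factors, and the reduced bilinear curve would then feed a nonlinear (pushover) analysis to estimate residual seismic capacity.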
This thesis, therefore, contributes both a methodological framework and a novel dataset, laying the groundwork for future developments in AI-augmented seismic risk management and digital-twin integration.
29 Jan 2026
English
LANEVE, Giovanni
SCARPINITI, MICHELE
Università degli Studi di Roma "La Sapienza"
Files in this record:
File: Teso_dottrorto_Saquella.pdf
Access: open access
License: Creative Commons
Size: 14.51 MB
Format: Adobe PDF

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/357256
The NBN code of this thesis is URN:NBN:IT:UNIROMA1-357256