On the use of event logs for the analysis of system failures

2011

Abstract

Computer systems underpin daily human activities and, even more importantly, play a key role in many critical domains. For this reason, understanding the failure behavior of computer systems is crucial to engineers. Event logs, i.e., the set of files where computing entities register events related to regular and anomalous activities that occur during the system operational phase, represent a valuable source of data for failure analysis. Studies based on event logs span the past three decades; however, computer systems have changed deeply over this timeframe. Investigating whether the traditional assumptions and techniques underlying log-based failure analysis remain suitable, despite the changes that have occurred in the computer systems industry, is of paramount importance. The focus of the thesis is to evaluate the accuracy of current logging mechanisms at reporting failures, and to develop novel techniques that make event logs effective for inferring failure data. The techniques involve the production, collection, and correlation of the failure data in the log to support accurate system dependability characterization. The benefits that can be achieved by adopting the proposed techniques are shown by means of experiments conducted in the context of real-world, complex distributed systems.
Files in this item:
File: pecchia_antonio_24.pdf
Access: only from BNCF and BNCR
Type: Other attached material
License: All rights reserved
Size: 5.5 MB
Format: Adobe PDF

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/338297
The NBN code of this thesis is URN:NBN:IT:BNCF-338297