Anomalies detection in credit risk data: an approach based on the Isolation Forest

Fabio, Forte
2019

Abstract

The thesis starts from the definition of risk as the chance of an unexpected or negative outcome. After a brief introduction to most of the risk categories relevant to banks and regulators, it focuses on credit risk models, an area in which the entire financial system is investing heavily to avoid a further financial crisis. Among credit risk metrics, Risk-Weighted Assets (RWAs) are an important measure in the current credit risk environment, since they aggregate the different risk factors affecting the evaluation of financial products. The accuracy of a credit risk model, like that of any model, depends not only on its effectiveness, parametrization and complexity, but also on the data used as input. This is often summarized as "garbage in, garbage out". The second chapter introduces several machine learning techniques for data anomaly detection, with a focus on the Local Outlier Factor (LOF) and Isolation Forests. In the third and fourth chapters, these algorithms are first tested on an artificial sample in order to show their statistical properties and then applied to a real credit risk dataset, where anomalies in RWA data are analyzed. [edited by Author]
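
As an illustration of the kind of experiment on an artificial sample that the abstract describes, the following is a minimal sketch and not the author's code. It assumes scikit-learn's `IsolationForest` and `LocalOutlierFactor`; the function name `detect_anomalies`, the synthetic data layout (a Gaussian cloud plus a few points on a distant ring) and all parameter values are hypothetical choices made only for this example.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor


def detect_anomalies(n_inliers=500, n_outliers=10, seed=0):
    """Label an artificial 2-D sample with Isolation Forest and LOF.

    Hypothetical helper: the data and the settings are assumptions,
    not taken from the thesis.
    """
    rng = np.random.default_rng(seed)

    # Regular observations: a standard Gaussian cloud around the origin.
    inliers = rng.normal(loc=0.0, scale=1.0, size=(n_inliers, 2))

    # Artificial anomalies: scattered points on a ring far from the cloud.
    angles = rng.uniform(0.0, 2.0 * np.pi, size=n_outliers)
    radii = rng.uniform(8.0, 10.0, size=n_outliers)
    outliers = np.column_stack([radii * np.cos(angles), radii * np.sin(angles)])

    X = np.vstack([inliers, outliers])

    # Share of points each detector is asked to flag (known here by design).
    contamination = n_outliers / (n_inliers + n_outliers)

    # Isolation Forest: anomalies are isolated by fewer random splits.
    iso = IsolationForest(
        n_estimators=100, contamination=contamination, random_state=seed
    )
    iso_labels = iso.fit_predict(X)  # -1 = anomaly, 1 = regular point

    # Local Outlier Factor: compares local density with that of neighbours.
    lof = LocalOutlierFactor(n_neighbors=20, contamination=contamination)
    lof_labels = lof.fit_predict(X)  # -1 = anomaly, 1 = regular point

    return X, iso_labels, lof_labels


if __name__ == "__main__":
    X, iso_labels, lof_labels = detect_anomalies()
    print("Isolation Forest flagged:", int((iso_labels == -1).sum()))
    print("LOF flagged:", int((lof_labels == -1).sum()))
```

In a real credit risk dataset the true share of anomalies is unknown, so the `contamination` value would have to be chosen or tuned rather than computed as it is here.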
16-Dec-2019
English
Isolation forest
Machine learning techniques
Credit risk
De Stefanis, Sergio Pietro
NIGLIO, Marcella
Università degli Studi di Salerno
Files in this item:
File Size Format
111017012714688674651366804458233835337.pdf

open access

License: All rights reserved
Size 3.72 MB
Format Adobe PDF
3544054958107791702451539550702413142.pdf

open access

License: All rights reserved
Size 54.37 kB
Format Adobe PDF
67888687953832373793333756118623622363.pdf

open access

License: All rights reserved
Size 53.65 kB
Format Adobe PDF

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/311379
The NBN code of this thesis is URN:NBN:IT:UNISA-311379