Generalized discrimination discovery on semi-structured data supported by ontology

Luong Thanh, Binh

Data mining has recently been recognized as an effective means of uncovering evidence and hidden causes of discrimination. If data mining finds associations showing that discriminatory treatment is strongly related to sensitive attributes, the discrimination can hardly be refuted. In this thesis, I propose a modification of the traditional data mining process that, with support from an ontology, unveils and represents discrimination in a semantically rich form for semi-structured business data with multiple-valued treatments. First, input data are preprocessed into a well-structured form with semantic relations, which considerably supports the later discrimination exploration. The framework then seeks possibly discriminatory relations between unequal treatments and attributes protected by law, e.g., race, religion, and sex. These relations are represented as association rules through the notion of matching pairs of itemsets that differ in their sensitive attributes, agree on their non-sensitive ones, and are subject to different treatments. By combining data mining with a reasoning service over the ontology, the resulting rules are semantically enriched with object properties between classes (concepts), which makes them more valuable and interesting than flat association rules. To address the drawback of local knowledge, the solution of “kNN as Situation Testing” is provided. In addition, a number of discrimination measures are provided to quantify the level of discrimination and to give a precise view of how different sensitive attributes negatively affect the decision and each other. Experimental results confirm the potential and flexibility of the approach.
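
As a rough illustration of the idea behind “kNN as Situation Testing”, the Python sketch below compares how often the k most similar protected individuals and the k most similar unprotected individuals received a negative decision. It is a minimal sketch under stated assumptions, not the thesis's implementation: the record layout (`features`, `protected`, `denied`), the Euclidean distance, the function names, and the toy data are all hypothetical choices made for this example.

```python
from math import sqrt

def euclidean(a, b):
    # Assumed distance; any suitable distance over the non-sensitive
    # attributes could be substituted here.
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def denial_rate_of_nearest(target_features, group, k):
    # Share of negative decisions among the k records in `group`
    # that are closest to the target.
    if not group:
        raise ValueError("group must contain at least one record")
    nearest = sorted(group, key=lambda r: euclidean(target_features, r["features"]))[:k]
    return sum(r["denied"] for r in nearest) / len(nearest)

def situation_testing_diff(individual, records, k):
    # Difference between the denial rate among similar protected
    # individuals and among similar unprotected individuals.
    others = [r for r in records if r is not individual]
    protected = [r for r in others if r["protected"]]
    unprotected = [r for r in others if not r["protected"]]
    p_protected = denial_rate_of_nearest(individual["features"], protected, k)
    p_unprotected = denial_rate_of_nearest(individual["features"], unprotected, k)
    return p_protected - p_unprotected

# Hypothetical toy data: features are non-sensitive attributes only.
records = [
    {"features": (1.0, 1.0), "protected": True, "denied": True},
    {"features": (1.1, 1.0), "protected": True, "denied": True},
    {"features": (0.9, 1.1), "protected": True, "denied": True},
    {"features": (1.0, 0.9), "protected": False, "denied": False},
    {"features": (1.2, 1.1), "protected": False, "denied": False},
    {"features": (5.0, 5.0), "protected": False, "denied": True},
]

diff = situation_testing_diff(records[0], records, k=2)
print(diff)
```

A difference close to 1 suggests that individuals who resemble the target but are outside the protected group were treated more favourably; a threshold on this difference would be a further assumption of any concrete analysis.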

2011

Publication date: 2011
Language: English
Keyword: QA75 Electronic computers. Computer science
Supervisor, Advisor, or Tutor: Turini, Prof. Franco
Publisher: Scuola IMT Alti Studi di Lucca
Collection: Scuola IMT Alti Studi di Lucca

Files in this item:

File	Access	Type	License	Size	Format
Luong_Thanh_phdthesis.pdf	Open access	Other attached material	All rights reserved	4.53 MB	Adobe PDF

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/144173

The NBN code of this thesis is URN:NBN:IT:IMTLUCCA-144173