Clustering in Non-Metric Spaces

2007

Abstract

One of the critical aspects of clustering algorithms is the correct identification of the dissimilarity measure used to drive the partitioning of the data set. The dissimilarity measure induces the cluster shape and therefore determines the success of clustering algorithms. As cluster shapes change from one data set to another, dissimilarity measures should be extracted from the data. To this aim, we exploit some pairs of points with known dissimilarity value to learn a dissimilarity measure. Then, we use the dissimilarity measure to guide an unsupervised fuzzy relational clustering algorithm. We apply and compare two different methods for dissimilarity extraction on both synthetic and real data sets. Further, we discuss the advantages of using a novel approach to relational clustering, recently proposed by the authors, that partitions the data set based on the proximity of the vectors containing the dissimilarity values between each pattern and all the other patterns in the data set. Experimental results show that, even with a low percentage of known dissimilarities, the combination of learning algorithm and fuzzy relational clustering algorithm generates truthful partitions of the data sets.
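
The idea of partitioning on the proximity of dissimilarity vectors can be illustrated with a small sketch. The abstract does not specify the algorithm, so the code below is only an assumption: it applies standard fuzzy c-means to the rows of a dissimilarity matrix, treating each row as the feature vector of a pattern. The function name `fuzzy_cmeans_on_rows`, the toy data and all parameters are hypothetical, and the matrix here is computed directly rather than learned from known pairs.

```python
import math

def fuzzy_cmeans_on_rows(D, c=2, m=2.0, iters=50):
    """Standard fuzzy c-means where pattern i is represented by row i of D.

    Illustrative assumption only; not the algorithm of the thesis.
    """
    n = len(D)
    # Deterministic start: use c rows spread evenly over the data set as prototypes.
    protos = [list(D[(k * (n - 1)) // max(c - 1, 1)]) for k in range(c)]
    U = []
    for _ in range(iters):
        # Membership update: closer prototypes receive higher membership.
        U = []
        for row in D:
            dists = [math.dist(row, p) for p in protos]
            if any(d == 0.0 for d in dists):
                # A row that coincides with a prototype belongs fully to it.
                hits = [d == 0.0 for d in dists]
                u = [1.0 / sum(hits) if h else 0.0 for h in hits]
            else:
                inv = [d ** (-2.0 / (m - 1.0)) for d in dists]
                total = sum(inv)
                u = [v / total for v in inv]
            U.append(u)
        # Prototype update: membership-weighted mean of the dissimilarity rows.
        for k in range(c):
            w = [U[i][k] ** m for i in range(n)]
            total = sum(w)
            protos[k] = [sum(w[i] * D[i][j] for i in range(n)) / total
                         for j in range(n)]
    return U

# Toy data: two well-separated groups on a line (hypothetical values).
points = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
D = [[abs(a - b) for b in points] for a in points]
U = fuzzy_cmeans_on_rows(D, c=2)
# Hard labels obtained by taking the cluster with the highest membership.
labels = [max(range(2), key=lambda k: u[k]) for u in U]
```

Patterns in the same group have similar dissimilarity rows, so they end up with high membership in the same cluster even though the algorithm never sees the original coordinates.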
27-Aug-2007
Italian
Lazzerini, Beatrice
Marcelloni, Francesco
Università degli Studi di Pisa
Files in this item:
File: PhDThesis_Cimino.pdf (embargoed until 25/05/2047)
Type: Other attached material
Size: 2.28 MB
Format: Adobe PDF

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/138506
The NBN code of this thesis is URN:NBN:IT:UNIPI-138506