XQuake: an XML-based Knowledge Discovery Environment

2009

Abstract

Data mining is the analysis of large volumes of data to find unsuspected relationships and to summarize the data in novel ways that are both understandable and useful to the data owner. The rapid growth of semi-structured sources has created a need to design and implement environments for mining XML data. Building on the principles of inductive database theory, this dissertation presents a flexible data mining system capable of obtaining, maintaining, representing and querying induced, deduced and prior knowledge stored in native XML databases. In particular, it summarizes our three years of experience in the design and development of XQuake, a query language that extends XQuery with mining primitives. The language features an intuitive syntax, good expressiveness, and uniform handling of data mining entities. Details of its implementation and an evaluation of its performance are also given.
25 Nov 2009
Italian
Turini, Franco
Università degli Studi di Pisa
Files in this item:
File: PHDThesisFinal.pdf

Embargoed until 10/12/2049

Type: Other attached material
Size: 4.72 MB
Format: Adobe PDF

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/127766
The NBN code of this thesis is URN:NBN:IT:UNIPI-127766