Temporal Data Analysis and Mining. A Multidimensional Approach and its Application in a Medical Domain

SABAINI, Alberto
2015

Abstract

The increasing amount of data available in all sectors is raising the need for decision makers to perform sophisticated analyses in order to cope with today's highly competitive world. Decision makers need several databases to analyze an organization as a whole. These data sources are often scattered and heterogeneous in content and format. Their integration is crucial for the decision-making process, and advanced analyses are needed to support it. This problem may be solved by the data warehousing approach. Data warehouses can be queried and analyzed by means of Online Analytical Processing (OLAP) and data mining tools. Decision support systems have recently been applied to the medical domain. Conventional multidimensional approaches do not meet the requirements of the clinical domain in terms of representation and advanced temporal support. Time is an important and pervasive concept of the real world that needs to be adequately modeled, and clinical domains are characterized by several temporal aspects. For instance, a therapy may be characterized by a start date, an end date, the date of the first drug administration, and so on. In this thesis we first deal with the design and development of a business intelligence solution for pharmacovigilance tasks. This system, called VigiSegn, was created in the context of a project in collaboration with the Italian Ministry of Health on drug surveillance over the Italian territory. We focus on the needs of domain experts in analyzing and assessing suspected adverse drug reaction cases, needs that were not satisfied by current data models. We address advanced modeling aspects of multidimensional structures, paying particular attention to the temporal features of data. We provide a formal definition of a multidimensional model for representing complex facts, addressing the issue of adequately representing interactions between multidimensional cubes.
We further extend the proposed model by underlining the importance of considering both point-based and interval-based semantics when analyzing temporal data. This includes advanced interval-based temporal operations and trend discovery. We also provide a sound data mining algorithm, focused on mining (approximate) temporal functional dependencies based on a temporal grouping of tuples.
English
Temporal data mining; Temporal clinical data warehouses; data warehouse; Data model
147
Files in this item:
PhDthesis.pdf (Adobe PDF, 2.12 MB); access only from BNCF and BNCR

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/181027
The NBN code of this thesis is URN:NBN:IT:UNIVR-181027