Graphs are ubiquitous. They spawn from domains such as the world wide web, social networks, biological phenomena, roads and transportation, and are found in arguably all scientific fields. Graphs that model real-world phenomena can be extremely large, with up to billions of vertices and edges, and can store extensive amounts of information, both structural and semantic. A huge body of literature is dedicated to analyzing such networks to discover information, but this is a challenging task, due to their linked nature and their size. Basic forms of graph analysis study some properties of graphs such as density or diameter. This can provide important information on the gross shape of a graph, but may not capture important details of substructures in a graph: while two similar graphs usually have similar properties, two graphs with similar properties may not be similar at all. In this thesis we will consider the enumeration of subgraphs with desired properties in the context of real-world networks. The analysis of subgraphs is a way to uncover this high quality information, which allows deeper understanding of the data and is a starting point for operations such as knowledge discovery, clustering and classification. Enumeration, when efficient, can also be a more flexible alternative to optimization algorithms, which comes in handy when the goal of the optimization is not clear or changes over time. The goal of this thesis is on one hand to provide useful tools for network analysis and, on the other hand, to give a deeper understanding of algorithm design techniques that are reliable in this context, and how they can be applied effectively. Finally, we take a step beyond enumeration and consider the problem of dealing with the potentially large number of solutions of an enumeration algorithm, reducing their redundancy and improving their significance in some real-world case studies.

Enumeration Algorithms for Real-World Networks: Efficiency and Beyond

2018

Abstract

Graphs are ubiquitous. They spawn from domains such as the world wide web, social networks, biological phenomena, roads and transportation, and are found in arguably all scientific fields. Graphs that model real-world phenomena can be extremely large, with up to billions of vertices and edges, and can store extensive amounts of information, both structural and semantic. A huge body of literature is dedicated to analyzing such networks to discover information, but this is a challenging task, due to their linked nature and their size. Basic forms of graph analysis study some properties of graphs such as density or diameter. This can provide important information on the gross shape of a graph, but may not capture important details of substructures in a graph: while two similar graphs usually have similar properties, two graphs with similar properties may not be similar at all. In this thesis we will consider the enumeration of subgraphs with desired properties in the context of real-world networks. The analysis of subgraphs is a way to uncover this high quality information, which allows deeper understanding of the data and is a starting point for operations such as knowledge discovery, clustering and classification. Enumeration, when efficient, can also be a more flexible alternative to optimization algorithms, which comes in handy when the goal of the optimization is not clear or changes over time. The goal of this thesis is on one hand to provide useful tools for network analysis and, on the other hand, to give a deeper understanding of algorithm design techniques that are reliable in this context, and how they can be applied effectively. Finally, we take a step beyond enumeration and consider the problem of dealing with the potentially large number of solutions of an enumeration algorithm, reducing their redundancy and improving their significance in some real-world case studies.
13-apr-2018
Italiano
Grossi, Roberto
Patrignani, Maurizio
Tomita, Etsuji
Creignou, Nadia
Università degli Studi di Pisa
File in questo prodotto:
File Dimensione Formato  
Alessio_Conte_PhD_Thesis_revised.pdf

accesso aperto

Tipologia: Altro materiale allegato
Dimensione 3.5 MB
Formato Adobe PDF
3.5 MB Adobe PDF Visualizza/Apri
Alessio_Conte_Report_Attivita.pdf

accesso aperto

Tipologia: Altro materiale allegato
Dimensione 106.19 kB
Formato Adobe PDF
106.19 kB Adobe PDF Visualizza/Apri

I documenti in UNITESI sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/20.500.14242/146590
Il codice NBN di questa tesi è URN:NBN:IT:UNIPI-146590