Distributed Graph Processing: Algorithms And Applications

Lulli, Alessandro

Thinking Like A Vertex (TLAV) is a popular computational paradigm suitable to express many distributed and iterative graph algorithms. It has been adopted as base computational paradigm for many of the currently available distributed frameworks and endorsed by numerous industries and academias. Also, it has been exploited to define algorithms to extract useful information from the nowadays increasing production of data which can be modeled as graphs. These facts strengthen the idea that exploiting distributed frameworks for graph analysis is an hot topic of research. As a matter of fact, we found that a solution for several algorithms is not always available or state-of-art algorithms are unsatisfactory, under many points of view. This thesis aims at providing guidelines for defining distributed graph algorithms structured according to TLAV and showing their applicability to real applications. We show how approximation, simplification and versatility can be combined to define novel distributed algorithms to improve the currently available solutions with the goal to enhance the functionalities of the algorithms. We show also how algorithms may be combined to define complex solutions and how they can be employed to solve relevant applications. In particular, we present novel algorithms for computing betweenness centrality, connected components and clustering. Such algorithms are exploited for Spam campaign detection, population estimation and hashtag centrality. To this end, we make use of real large dataset provided from our collaborations, Symantec for Spam emails, a large Italian Mobile Phone provider, mobile calls, and ISTI, CNR for a two years collection of real tweets from the Twitter social network.

Distributed Graph Processing: Algorithms And Applications

LULLI, ALESSANDRO

2017

Abstract

Scheda breve

Scheda completa

Scheda completa (DC)

	Data di pubblicazione
	
				6-apr-2017
			
	Lingua
	
				Italiano
			
	Parola chiave
	
				apache spark
clustering
connected components
distributed processing
graph
			
	Relatore, Supervisor, Advisor o Tutor
	
				Ricci, Laura Emilia Maria
Dazzi, Patrizio
			
	Collezione di appartenenza
	
				Università degli Studi di Pisa

File in questo prodotto:

File	Dimensione	Formato
AlessandroLulli_PhDThesis.pdf accesso aperto Tipologia: Altro materiale allegato Licenza: Tutti i diritti riservati Dimensione 5.22 MB Formato Adobe PDF Visualizza/Apri	5.22 MB	Adobe PDF	Visualizza/Apri
reportborsista_finale.pdf accesso aperto Tipologia: Altro materiale allegato Licenza: Tutti i diritti riservati Dimensione 346.23 kB Formato Adobe PDF Visualizza/Apri	346.23 kB	Adobe PDF	Visualizza/Apri

I documenti in UNITESI sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/20.500.14242/133896

Il codice NBN di questa tesi è URN:NBN:IT:UNIPI-133896