Programming Abstractions for Data Sharing in Distributed Spaces

2017

Abstract

The cost of moving data is becoming a dominant factor for performance and energy efficiency in high-performance computing systems. To minimize data movement, applications have to consider data placement in order to optimize data transfer between processing units. To address this scenario, new compiler techniques, tools, libraries and programming abstractions are necessary for harnessing data locality. The goal of this thesis is to offer suitable solutions to the challenging problems of data distribution and locality in large-scale high-performance computing. To this end, we have developed new programming primitives for two partitioned data space languages, namely Klaim and X10. Abstractions for partitions and data items are called tuple spaces and tuples in Klaim, and places and objects in X10. As a result, we designed two languages, RepliKlaim and SharedX10, which enrich Klaim and X10 with new primitives for data sharing. Our approach aims at allowing programmers to specify and coordinate shared data and, specifically, to replicate shared data items while taking into account desired consistency properties. Programmers can exploit such flexible mechanisms to adapt data distribution and locality to the desired levels, e.g., to improve performance in terms of concurrency and data access. We investigate issues related to replica consistency and provide an analysis of performance and programmability, including several applications from large-scale graph analytics.
Jul-2017
English
QA75 Electronic computers. Computer science
De Nicola, Prof. Rocco
Scuola IMT Alti Studi di Lucca
Files in this item:
File: Andric_phdthesis.pdf
Access: open access
Type: Other attached material
Size: 1.65 MB
Format: Adobe PDF

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/137379
The NBN code of this thesis is URN:NBN:IT:IMTLUCCA-137379