A unified programming model for scale-up and scale-out platforms
TONCI, NICOLO'
2025
Abstract
In the era of data-driven computing, the exponential growth in data generation, fueled by the proliferation of IoT devices, Big Data analytics, and AI systems, demands advanced parallel programming solutions that can exploit both shared- and distributed-memory architectures. This thesis addresses the need for scalable, portable, and efficient programming tools by presenting a novel run-time system and programming model built upon the FastFlow framework. Designed to support both scale-up and scale-out paradigms, the proposed approach lets developers exploit heterogeneous hardware platforms efficiently without sacrificing programmability or requiring significant code refactoring. A key innovation is the introduction of distributed groups (dgroups), logical subdivisions of FastFlow building blocks that preserve the application's business logic while enabling flexible distribution of the computation. To improve adaptability to environments beyond traditional HPC clusters, the thesis introduces the Multi-Transport Communication Library (MTCL), which offers a unified API over multiple transport protocols. The system’s effectiveness is validated through benchmarks and a set of real-world applications, ranging from decentralized machine learning scenarios to scalable bioinformatics pipelines deployed in distributed environments.

| File | Size | Format | |
|---|---|---|---|
| PhD_Thesis_NicoloTonci.pdf (open access; license: Creative Commons) | 8.41 MB | Adobe PDF | View/Open |
Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/20.500.14242/307959
URN:NBN:IT:UNIPI-307959