Cache Architectures for Wire-Delay Dominated CMP Systems

SOLINAS, MARCO
2009

Abstract

Increasing on-chip wire delay and growing off-chip miss latency present two key challenges in designing large Level-2 (L2) caches for chip multiprocessors (CMPs). Currently, some CMPs use a shared L2 cache to maximize cache capacity and minimize off-chip misses. Others use private L2 caches, replicating data to limit the delay from slow on-chip wires and minimize cache access time. Ideally, to improve performance across a wide variety of workloads, a CMP would have both the capacity of a shared cache and the access latency of private caches. In this context, Non-Uniform Cache Architecture (NUCA) caches have been shown to tolerate wire-delay effects while maintaining a large on-chip storage capacity. In this thesis, we investigate the choice of coherence protocol (MESI or MOESI) and the overall system topology as design tradeoffs for S-NUCA-based CMP systems. We also propose and evaluate a novel block migration scheme for D-NUCA-based systems that addresses two specific problems that can arise from the presence of multiple traffic sources. Results show that, in S-NUCA-based CMP systems, the choice between MESI and MOESI does not have a significant impact on performance, while the system topology can lead to very different behaviors. Block migration is introduced in NUCA caches to reduce access latency in a shared cache. Our results show that the migration mechanism is effective in reducing the average L1 miss latency, but the impact on overall performance is smaller because the L1 miss rate is very low.
18-apr-2009
Italian
cache memory
CMP
coherence protocol
NUCA
wire-delay
Dini, Gianluca
Foglia, Pierfrancesco
Prete, Cosimo Antonio
Files in this item:
File Size Format
TesiDottoratoMarcoSolinas.pdf

embargoed until 29/05/2049

Type: Other attached material
License: All rights reserved
Size 3.59 MB
Format Adobe PDF
CopertinaTesiMarcoSolinas.pdf

open access

License: All rights reserved
Size 34.39 kB
Format Adobe PDF

Documents in UNITESI are protected by copyright, and all rights are reserved unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/132926
The NBN code of this thesis is URN:NBN:IT:UNIPI-132926