Scalability Evaluation of a Polymorphic Register File: A CG Case Study

Bibliographic Details
Published in: Architecture of Computing Systems - ARCS 2011, pp. 13-25
Main Authors: Ciobanu, Cătălin B.; Martorell, Xavier; Kuzmanov, Georgi K.; Ramirez, Alex; Gaydadjiev, Georgi N.
Format: Book Chapter Publication
Language: English
Published: Berlin, Heidelberg: Springer Berlin Heidelberg (Springer), 2011
Series: Lecture Notes in Computer Science
ISBN: 3642191363, 9783642191367
ISSN: 0302-9743, 1611-3349
DOI: 10.1007/978-3-642-19137-4_2

Summary: We evaluate the scalability of a Polymorphic Register File using the Conjugate Gradient method as a case study. We focus on a heterogeneous multi-processor architecture, taking into consideration critical parameters such as cache bandwidth and memory latency. We compare the performance of 256 Polymorphic Register File-augmented workers against a single Cell PowerPC Processor Unit (PPU). In this scenario, simulation results suggest that absolute speedups of up to 200 times can be obtained for the Sparse Matrix Vector Multiplication kernel. Moreover, when an equal number of workers in the range 1-256 is employed, our design is between 1.7 and 4.2 times faster than a Cell PPU-based system. Furthermore, we study the impact of memory latency and cache bandwidth on the sustainable speedups of the system considered. Our tests suggest that a 128-worker configuration requires the caches to deliver 1638.4 GB/s in order to preserve 80% of its peak speedup.
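
To put the last figure in perspective, the aggregate cache bandwidth can be divided by the number of workers. The sketch below does only that arithmetic; it assumes the bandwidth is shared evenly among workers, which the summary does not state, and the function name `per_worker_bandwidth` is hypothetical.

```python
# Figures quoted in the summary above.
WORKERS = 128
AGGREGATE_CACHE_BANDWIDTH_GB_S = 1638.4  # needed to preserve 80% of peak speedup


def per_worker_bandwidth(aggregate_gb_s: float, workers: int) -> float:
    """Average cache bandwidth per worker in GB/s.

    Assumes an even split across workers (an illustrative assumption,
    not something the summary claims).
    """
    if workers <= 0:
        raise ValueError("workers must be positive")
    return aggregate_gb_s / workers


share = per_worker_bandwidth(AGGREGATE_CACHE_BANDWIDTH_GB_S, WORKERS)
print(f"{share:.1f} GB/s per worker")  # prints "12.8 GB/s per worker"
```

Under that even-split assumption, each of the 128 workers would need roughly 12.8 GB/s of cache bandwidth on average.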