The MOMMS Family of Matrix Multiplication Algorithms
| Main Authors | , |
|---|---|
| Format | Journal Article |
| Language | English |
| Published | 11.04.2019 |
| DOI | 10.48550/arxiv.1904.05717 |
| Summary: | As the ratio between the rate of computation and the rate at which data can be retrieved from various layers of memory continues to deteriorate, a question arises: Will the current best algorithms for computing matrix-matrix multiplication on future CPUs continue to be (near) optimal? This paper provides compelling analytical and empirical evidence that the answer is "no". The analytical results guide us to a new family of algorithms of which the current state-of-the-art "Goto's algorithm" is but one member. The empirical results, on architectures that were custom built to reduce the amount of bandwidth to main memory, show that under different circumstances, different members of the family become superior. Thus, this family will likely start playing a prominent role going forward. |
|---|---|
| DOI: | 10.48550/arxiv.1904.05717 |