New algorithm for tensor contractions on multi‐core CPUs, GPUs, and accelerators enables CCSD and EOM‐CCSD calculations with over 1000 basis functions on a single compute node
| Published in | Journal of computational chemistry Vol. 38; no. 11; pp. 842 - 853 |
|---|---|
| Main Authors | , |
| Format | Journal Article |
| Language | English |
| Published | United States, Wiley Subscription Services, Inc, 30.04.2017 |
| ISSN | 0192-8651, 1096-987X |
| DOI | 10.1002/jcc.24713 |
| Summary: | A new hardware‐agnostic contraction algorithm for tensors of arbitrary symmetry and sparsity is presented. The algorithm is implemented as a stand‐alone open‐source code libxm. This code is also integrated with general tensor library libtensor and with the Q‐Chem quantum‐chemistry package. An overview of the algorithm, its implementation, and benchmarks are presented. Similarly to other tensor software, the algorithm exploits efficient matrix multiplication libraries and assumes that tensors are stored in a block‐tensor form. The distinguishing features of the algorithm are: (i) efficient repackaging of the individual blocks into large matrices and back, which affords efficient graphics processing unit (GPU)‐enabled calculations without modifications of higher‐level codes; (ii) fully asynchronous data transfer between disk storage and fast memory. The algorithm enables canonical all‐electron coupled‐cluster and equation‐of‐motion coupled‐cluster calculations with single and double substitutions (CCSD and EOM‐CCSD) with over 1000 basis functions on a single quad‐GPU machine. We show that the algorithm exhibits predicted theoretical scaling for canonical CCSD calculations, $O(N^6)$, irrespective of the data size on disk. © 2017 Wiley Periodicals, Inc.
An algorithm for tensor contractions and its open‐source implementation are presented. The contraction algorithm is hardware‐agnostic and works on multi‐core CPUs, GPUs, and floating‐point accelerators. The code works efficiently with tensors of arbitrary symmetry and sparsity of up to tens of terabytes in size. Its utility is demonstrated in electronic structure calculations. The algorithm enables canonical all‐electron CCSD and EOM‐CCSD calculations with over 1000 basis functions on a single quad‐GPU machine. |
|---|---|
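
The summary's first distinguishing feature, repackaging individual blocks into large matrices and back so that a single matrix multiplication does the work, can be illustrated with a minimal sketch. This is not the libxm API: every name below (`pack`, `unpack`, `gemm`, `contract`) is hypothetical, the tensors are restricted to order 2 (block matrices), everything stays in memory, and the naive `gemm` is only a stand-in for a call to an optimized CPU or GPU matrix multiplication library.

```python
from typing import Dict, List, Tuple

Matrix = List[List[float]]
# Block-tensor form (assumed layout): only nonzero blocks are stored,
# keyed by (block row, block column).
Blocks = Dict[Tuple[int, int], Matrix]


def offsets(sizes: List[int]) -> List[int]:
    """Starting index of each block along one dimension."""
    out, total = [], 0
    for s in sizes:
        out.append(total)
        total += s
    return out


def pack(blocks: Blocks, row_sizes: List[int], col_sizes: List[int]) -> Matrix:
    """Copy the stored blocks into one large dense matrix (zeros elsewhere)."""
    row_off, col_off = offsets(row_sizes), offsets(col_sizes)
    big = [[0.0] * sum(col_sizes) for _ in range(sum(row_sizes))]
    for (bi, bj), blk in blocks.items():
        for r, row in enumerate(blk):
            start = col_off[bj]
            big[row_off[bi] + r][start:start + len(row)] = row
    return big


def gemm(a: Matrix, b: Matrix) -> Matrix:
    """Naive matrix product; a stand-in for an optimized GEMM routine."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def unpack(big: Matrix, row_sizes: List[int], col_sizes: List[int]) -> Blocks:
    """Split a large matrix back into blocks, keeping only nonzero ones."""
    row_off, col_off = offsets(row_sizes), offsets(col_sizes)
    blocks: Blocks = {}
    for bi, rs in enumerate(row_sizes):
        for bj, cs in enumerate(col_sizes):
            blk = [big[row_off[bi] + r][col_off[bj]:col_off[bj] + cs]
                   for r in range(rs)]
            if any(v != 0.0 for row in blk for v in row):
                blocks[(bi, bj)] = blk
    return blocks


def contract(a: Blocks, b: Blocks, m_sizes: List[int],
             k_sizes: List[int], n_sizes: List[int]) -> Blocks:
    """C[m, n] = sum_k A[m, k] * B[k, n] via pack -> one GEMM -> unpack."""
    big_c = gemm(pack(a, m_sizes, k_sizes), pack(b, k_sizes, n_sizes))
    return unpack(big_c, m_sizes, n_sizes)


# Two block-diagonal inputs with block sizes chosen only for illustration.
a_blocks = {(0, 0): [[1.0, 2.0]], (1, 1): [[3.0], [4.0]]}
b_blocks = {(0, 0): [[1.0], [1.0]], (1, 1): [[2.0]]}
c_blocks = contract(a_blocks, b_blocks, [1, 2], [2, 1], [1, 1])
# c_blocks == {(0, 0): [[3.0]], (1, 1): [[6.0], [8.0]]}
```

Because the higher-level code only ever sees `contract`, swapping `gemm` for a GPU-backed routine would not require changes above that call, which is the point the summary makes about GPU-enabled calculations. The sketch leaves out symmetry handling, higher-order tensors, and the asynchronous disk transfers the summary describes.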