Tensor-tensor algebra for optimal representation and compression of multiway data

Bibliographic Details
Published in: Proceedings of the National Academy of Sciences - PNAS, Vol. 118, No. 28, pp. 1-12
Main Authors: Kilmer, Misha E.; Horesh, Lior; Avron, Haim; Newman, Elizabeth
Format: Journal Article
Language: English
Published: United States: National Academy of Sciences, 13.07.2021
ISSN: 0027-8424 (print); 1091-6490 (electronic)
DOI: 10.1073/pnas.2015851118

Summary: With the advent of machine learning and its overarching pervasiveness, it is imperative to devise ways to represent large datasets efficiently while distilling the intrinsic features necessary for subsequent analysis. The primary workhorse for data dimensionality reduction and feature extraction has been the matrix singular value decomposition (SVD), which presupposes that the data have been arranged in matrix format. A primary goal of this study is to show that high-dimensional datasets are more compressible when treated as tensors (i.e., multiway arrays) and compressed via tensor-SVDs defined under the tensor-tensor product construct and its generalizations. We begin by proving Eckart-Young optimality results for families of tensor-SVDs under two different truncation strategies. Since such optimality properties can be proven in both matrix and tensor-based algebras, a fundamental question arises: does the tensor construct subsume the matrix construct in terms of representation efficiency? The answer is positive, as we prove by showing that a tensor-tensor representation of an equal-dimensional spanning space can be superior to its matrix counterpart. We then use these optimality results to investigate how the compressed representation provided by the truncated tensor-SVD relates, both theoretically and empirically, to its two closest tensor-based analogs, the truncated higher-order SVD (HOSVD) and the truncated tensor-train SVD.
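The truncated tensor-SVD mentioned in the summary is built on the t-product, which multiplies third-order tensors facewise in the Fourier domain along the third mode. Below is a minimal NumPy sketch of the fixed-truncation variant (keeping the k leading singular triplets of each frontal slice in the transform domain); the function name, test dimensions, and choice of k are illustrative and not taken from the paper, which also analyzes a second, global multirank truncation strategy.

```python
import numpy as np

def tsvd_approx(A, k):
    """Fixed-k truncated t-SVD approximation of a real 3-way array A.

    Moves A to the transform domain with an FFT along the third mode,
    truncates the matrix SVD of every frontal slice to its k leading
    singular triplets, and transforms back. (Illustrative sketch; the
    paper also studies a global multirank truncation.)
    """
    Ahat = np.fft.fft(A, axis=2)
    Bhat = np.empty_like(Ahat)
    for i in range(A.shape[2]):
        U, s, Vt = np.linalg.svd(Ahat[:, :, i], full_matrices=False)
        Bhat[:, :, i] = (U[:, :k] * s[:k]) @ Vt[:k, :]
    # For real A the truncated slices remain conjugate-symmetric up to
    # roundoff, so the imaginary part of the inverse FFT is negligible.
    return np.fft.ifft(Bhat, axis=2).real

# Illustrative usage with hypothetical dimensions and truncation level.
rng = np.random.default_rng(0)
A = rng.standard_normal((32, 32, 16))
B = tsvd_approx(A, k=8)
print(np.linalg.norm(A - B) / np.linalg.norm(A))  # relative Frobenius error
```

Working facewise in the transform domain is what makes the t-product algebra matrix-like: each frontal slice inherits the classical matrix Eckart-Young theorem, which is the mechanism behind the tensor optimality results the paper proves.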
Edited by David L. Donoho, Stanford University, Stanford, CA, and approved February 8, 2021 (received for review July 29, 2020)
Author contributions: M.E.K., L.H., H.A., and E.N. designed research, performed research, analyzed data, and wrote the paper.