Benchmarking principal component analysis for large-scale single-cell RNA-sequencing

Bibliographic Details
Published in Genome Biology Vol. 21; no. 1; p. 9
Main Authors Tsuyuzaki, Koki; Sato, Hiroyuki; Sato, Kenta; Nikaido, Itoshi
Format Journal Article
Language English
Published London BioMed Central 20.01.2020
Springer Nature B.V
BMC
ISSN 1474-760X
1474-7596
DOI 10.1186/s13059-019-1900-3

More Information
Summary: Background: Principal component analysis (PCA) is an essential method for analyzing single-cell RNA-seq (scRNA-seq) datasets, but for large-scale scRNA-seq datasets it takes a long time to compute and consumes large amounts of memory. Results: In this work, we review the existing fast and memory-efficient PCA algorithms and implementations and evaluate their practical application to large-scale scRNA-seq datasets. Our benchmark shows that some PCA algorithms based on Krylov subspace and randomized singular value decomposition are fast, memory-efficient, and more accurate than the other algorithms. Conclusion: We develop a guideline for selecting an appropriate PCA implementation based on the differences in the computational environments of users and developers.