Benchmarking principal component analysis for large-scale single-cell RNA-sequencing
| Published in | Genome Biology Vol. 21; no. 1; p. 9 |
|---|---|
| Main Authors | , , , |
| Format | Journal Article |
| Language | English |
| Published | London, BioMed Central, 20.01.2020; Springer Nature B.V; BMC |
| Subjects | |
| ISSN | 1474-760X, 1474-7596 |
| DOI | 10.1186/s13059-019-1900-3 |
| Summary: | Background: Principal component analysis (PCA) is an essential method for analyzing single-cell RNA-seq (scRNA-seq) datasets, but for large-scale scRNA-seq datasets, the computation takes a long time and consumes large amounts of memory. Results: In this work, we review the existing fast and memory-efficient PCA algorithms and implementations and evaluate their practical application to large-scale scRNA-seq datasets. Our benchmark shows that some PCA algorithms based on Krylov subspace and randomized singular value decomposition are fast, memory-efficient, and more accurate than the other algorithms. Conclusion: We develop a guideline to select an appropriate PCA implementation based on the differences in the computational environment of users and developers. |
|---|---|