Bulk brain tissue cell type deconvolution with bias correction for single‐nuclei RNA‐seq

Bibliographic Details
Published in Alzheimer's & Dementia Vol. 18; no. S3
Main Authors O'Neill, Nicholas K, Hu, Junming, Stein, Thor D., Zhang, Xiaoling, Farrer, Lindsay A.
Format Journal Article
Language English
Published 01.12.2022
ISSN 1552-5260, 1552-5279
DOI 10.1002/alz.065942

Summary:
Background: Quantifying cell type percentages from bulk brain RNA‐sequencing data helps researchers understand the cellular components underlying disease pathogenesis. Although it was designed for single‐cell RNA‐sequencing (scRNA‐seq) data, the MuSiC deconvolution algorithm can use single‐nuclei RNA‐sequencing (snRNA‐seq) data generated from brain tissue to estimate cell type proportions in bulk brain RNA‐sequencing data, but it does not fully compensate for sequencing differences between bulk and snRNA‐seq data. We modified MuSiC's gene weighting scheme to compensate for this sequencing bias.

Methods: MuSiC recalculates each gene's weight at every iteration using the residual from the previous iteration, gene variation among subjects, and other factors. We calculated the difference in RNA capture rate between single‐nuclei and bulk sequencing data for each gene and reduced MuSiC's weight for genes with strong differences. We compared the accuracy of MuSiC and our modified algorithm (mMuSiC) by simulating bulk data with seven brain cell types and calculating the concordance correlation coefficient (CCC) between true and estimated cell type percentages. We also assessed both algorithms on human brain dorsolateral prefrontal cortex (DLPFC) bulk RNA‐seq data: a ROSMAP data set with subject‐matched immunohistochemistry (IHC) measurements for 69 samples, and a data set from the Framingham Heart Study/Boston University Alzheimer Disease Research Center with subject‐matched microglial (IBA1+) cell density measurements for 163 samples from the same brain region.

Results: mMuSiC improves the CCC between estimated and true cell fractions for each cell type in our four simulations, with a p‐value of 0.014. The improvement is especially pronounced for inhibitory and excitatory neurons, with an average CCC of 0.45 for mMuSiC and 0.22 for MuSiC. In human brain DLPFC bulk RNA‐seq data, our method also improves the CCC between cell fraction estimates and IHC measurements for each cell type tested in ROSMAP, with an average CCC of 0.14 for mMuSiC and 0.10 for MuSiC. The correlation between microglia cell fraction estimates and IBA1+ cell density measurements is also higher for mMuSiC (R=0.33, p=1.5e‐5) than for MuSiC (R=0.12, p=0.11).

Conclusion: mMuSiC improves cell fraction estimates of bulk brain RNA‐seq data in studies using snRNA‐seq. This is particularly useful for brain research where snRNA‐seq is unavailable.
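
The accuracy metric used throughout the summary, the concordance correlation coefficient, can be illustrated with a short sketch. The code below is not taken from the authors' implementation; it assumes the standard definition by Lin, CCC = 2·cov(x, y) / (var(x) + var(y) + (mean(x) − mean(y))²), and the function name `concordance_ccc` and the toy cell fractions are hypothetical, chosen only for illustration.

```python
import numpy as np


def concordance_ccc(true_frac, est_frac):
    """Lin's concordance correlation coefficient between two 1-D arrays.

    Hypothetical helper for illustration; uses population (ddof=0)
    variance and covariance, as in Lin's original definition.
    """
    x = np.asarray(true_frac, dtype=float)
    y = np.asarray(est_frac, dtype=float)
    if x.ndim != 1 or x.shape != y.shape or x.size < 2:
        raise ValueError("inputs must be 1-D arrays of equal length >= 2")

    mean_x, mean_y = x.mean(), y.mean()
    cov_xy = np.mean((x - mean_x) * (y - mean_y))
    denom = x.var() + y.var() + (mean_x - mean_y) ** 2
    if denom == 0.0:
        # Both inputs are constant and equal: the coefficient is undefined.
        return float("nan")
    return 2.0 * cov_xy / denom


# Toy, made-up fractions for one cell type across four samples.
true_frac = np.array([0.10, 0.20, 0.30, 0.40])
shifted_est = true_frac + 0.10  # perfectly correlated, but biased upward

ccc_perfect = concordance_ccc(true_frac, true_frac)
ccc_shifted = concordance_ccc(true_frac, shifted_est)
pearson_shifted = np.corrcoef(true_frac, shifted_est)[0, 1]
```

In this toy case the shifted estimates have a Pearson correlation of 1 with the true fractions, yet their CCC is below 1 because the coefficient also penalizes the constant offset. That sensitivity to systematic over- or under-estimation is why CCC suits the comparison of estimated and true cell type percentages.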