pyComBat, a Python tool for batch effects correction in high-throughput molecular data using empirical Bayes methods
| Published in | BMC Bioinformatics Vol. 24; no. 1; pp. 459 - 9 |
|---|---|
| Main Authors | , , , , , , | 
| Format | Journal Article | 
| Language | English | 
| Published | London: BioMed Central, 07.12.2023 (BioMed Central Ltd; Springer Nature B.V; BMC) |
| ISSN | 1471-2105 |
| DOI | 10.1186/s12859-023-05578-5 | 
| Summary: | Background: Variability in datasets is not only the product of biological processes: it is also the product of technical biases. ComBat and ComBat-Seq are among the most widely used tools for correcting those technical biases, called batch effects, in microarray and RNA-Seq expression data, respectively. Results: In this technical note, we present a new Python implementation of ComBat and ComBat-Seq. While the mathematical framework is strictly the same, we show here that our implementations: (i) yield similar results in terms of batch effect correction; (ii) are as fast as or faster than the original implementations in R; and (iii) offer new tools for the bioinformatics community to participate in its development. pyComBat is implemented in the Python language and is distributed under the GPL-3.0 license (https://www.gnu.org/licenses/gpl-3.0.en.html) as a module of the inmoose package. Source code is available at https://github.com/epigenelabs/inmoose and the Python package at https://pypi.org/project/inmoose. Conclusions: We present a new Python implementation of the state-of-the-art tools ComBat and ComBat-Seq for the correction of batch effects in microarray and RNA-Seq data. This new implementation, based on the same mathematical frameworks as ComBat and ComBat-Seq, offers similar power for batch effect correction at reduced computational cost. |
|---|---|
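
To make the idea of batch effect correction in the summary concrete, here is a minimal sketch of a per-batch location and scale adjustment for a single gene. It is not pyComBat's code and not the inmoose API: the function name `adjust_batches` and the toy data are hypothetical, and the empirical Bayes shrinkage and covariate handling that ComBat relies on are deliberately left out.

```python
from statistics import mean, pstdev


def adjust_batches(values, batches):
    """Remove per-batch shifts in location and scale for one gene.

    Simplified illustration only (hypothetical helper): no empirical
    Bayes shrinkage of the batch estimates and no biological covariates.
    """
    overall_mean = mean(values)
    overall_sd = pstdev(values)
    adjusted = list(values)
    for batch in set(batches):
        # Indices of the samples that belong to this batch.
        idx = [i for i, b in enumerate(batches) if b == batch]
        batch_values = [values[i] for i in idx]
        batch_mean = mean(batch_values)
        batch_sd = pstdev(batch_values) or 1.0  # guard against zero spread
        for i in idx:
            # Standardize within the batch, then map back to the overall scale.
            z = (values[i] - batch_mean) / batch_sd
            adjusted[i] = z * overall_sd + overall_mean
    return adjusted


# Toy data: batch "B" is shifted upward relative to batch "A".
expression = [5.0, 6.0, 7.0, 9.0, 10.0, 11.0]
batches = ["A", "A", "A", "B", "B", "B"]
corrected = adjust_batches(expression, batches)
```

After the adjustment both batches share the same mean, so the artificial shift between "A" and "B" is gone. Mapping back to the overall spread is a simplification chosen for brevity; the real method estimates and shrinks batch parameters across many genes.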