
Data: Benchmarking batch correction methods for synthesizing imbalanced microbiome community profiles

Mendeley Data, indexed 2026-04-09
Download link:
https://data.mendeley.com/datasets/5xrfg5dym6
Description:
Batch variation is unwanted variation that plagues syntheses of microbiome sequence data. Batch effect correction algorithms (BECAs) aim to remove batch effects, but most BECAs do not account for a common problem in which batch covariates of interest are imbalanced (e.g., when classes do not appear in all batches or appear in uneven sample proportions). Here we tested five BECAs on eight seed microbiome studies, which are prone to severe batch effects due to variable seed handling practices. The BECAs compared were zero-mean centering (ZMC), Ratio-A, ConQuR, PLSDA, and wPLSDA (developed for imbalanced batch covariates). We also accounted for the sparsity and compositionality of microbiome data with zero imputation and the centered log-ratio transformation (CLR). We found that 1) according to a redundancy analysis, no method reduced the variation explained by the unwanted covariate to zero; 2) according to a guided principal component analysis, which quantifies the magnitude of batch effects, ConQuR, Ratio-A, and ZMC eliminated the measured batch effect (δ = 0, p < 0.001); and 3) for ZMC, CLR and zero imputation improved both the removal of batch effects and the variance explained by the wanted variable. These results call for careful application of BECAs and indicate that ZMC, Ratio-A, and ConQuR provide some improvement in remediating batch effects in data with imbalanced batch covariates. Continued development of BECAs is urgently needed for successful batch correction in this use case.
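For readers unfamiliar with the two simplest steps named in the description, the following is a minimal sketch of a CLR transformation followed by zero-mean centering per batch. It is not the code used in the study: the function names, the toy count matrix, and the use of a fixed pseudocount as a stand-in for zero imputation are all assumptions made for illustration.

```python
import numpy as np


def clr_with_pseudocount(counts, pseudocount=0.5):
    """Centered log-ratio transform of a samples x taxa count matrix.

    Assumption: zeros are handled by adding a fixed pseudocount, a simple
    stand-in for the zero imputation mentioned in the description.
    """
    x = np.asarray(counts, dtype=float) + pseudocount
    log_x = np.log(x)
    # Subtract each sample's mean log abundance (log of its geometric mean).
    return log_x - log_x.mean(axis=1, keepdims=True)


def zero_mean_center(values, batches):
    """Subtract each batch's per-taxon mean so every batch is centered at zero."""
    values = np.asarray(values, dtype=float)
    batches = np.asarray(batches)
    corrected = np.empty_like(values)
    for batch in np.unique(batches):
        mask = batches == batch
        corrected[mask] = values[mask] - values[mask].mean(axis=0, keepdims=True)
    return corrected


# Hypothetical toy data: 4 samples x 3 taxa, two batches of two samples each.
counts = np.array([[10, 0, 5],
                   [20, 3, 0],
                   [1, 8, 8],
                   [0, 12, 4]])
batches = np.array(["A", "A", "B", "B"])

clr = clr_with_pseudocount(counts)
zmc = zero_mean_center(clr, batches)
```

Because centering removes every per-batch mean, a class that appears in only one batch would lose its class-level signal along with the batch signal, which is the imbalance problem the description raises.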
Provided by:
USDA-ARS