Data: Benchmarking batch correction methods for synthesizing imbalanced microbiome community profiles

Name: Data: Benchmarking batch correction methods for synthesizing imbalanced microbiome community profiles
Creator: USDA-ARS
License: No description available

Indexed from Mendeley Data on 2026-04-09

Download link:

https://data.mendeley.com/datasets/5xrfg5dym6

Description:

Batch variation is unwanted variation that plagues syntheses of microbiome sequence data. Batch effect correction algorithms (BECAs) aim to remove batch effects, but most BECAs do not account for a common problem whereby batch covariates of interest are imbalanced (e.g., when classes do not appear in all batches or do not appear in even sample proportions). Here we tested five BECAs on eight seed microbiome studies, which are prone to severe batch effects due to variable seed handling practices. We compared the performance of zero-mean centering (ZMC), Ratio-A, ConQuR, PLSDA, and wPLSDA (developed for imbalanced batch covariates). We also accounted for the sparsity and compositionality of microbiome data with zero imputation and the centered log-ratio (CLR) transformation. We found that 1) according to a redundancy analysis, no method reduced the variation explained by the unwanted covariate to zero; 2) ConQuR, Ratio-A, and ZMC removed batch effects according to a guided principal component analysis, which quantifies the magnitude of batch effects (δ = 0, p<0.001); and 3) CLR and zero imputation improved both the removal of batch effects and the variance explained by the wanted variable when ZMC was applied. These results call for careful application of BECAs and indicate that ZMC, Ratio-A, and ConQuR provide some improvement in remediating batch effects in batch-covariate imbalanced data. Continued development of BECAs is urgently needed for successful batch correction in this use case.
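
As a rough illustration of two of the steps named in the description, the sketch below applies a CLR transformation followed by zero-mean centering within each batch. It is not the authors' pipeline: the toy count table, the batch labels, the function names, and the use of a fixed pseudocount of 0.5 as the zero-replacement step are all assumptions made for this example, and the dataset itself may use a different imputation method.

```python
import numpy as np


def clr_transform(counts, pseudocount=0.5):
    """Centered log-ratio transform of a samples-by-taxa count table.

    Zeros are handled by adding a fixed pseudocount. This is an assumed,
    simplistic stand-in for the zero imputation mentioned in the description.
    """
    x = np.asarray(counts, dtype=float) + pseudocount
    log_x = np.log(x)
    # Subtract each sample's mean log abundance (log of its geometric mean).
    return log_x - log_x.mean(axis=1, keepdims=True)


def zero_mean_center(values, batches):
    """Zero-mean centering: subtract the per-batch mean of every feature."""
    values = np.asarray(values, dtype=float)
    batches = np.asarray(batches)
    corrected = values.copy()
    for batch in np.unique(batches):
        mask = batches == batch
        corrected[mask] -= values[mask].mean(axis=0)
    return corrected


# Hypothetical toy data: 4 samples (rows) by 3 taxa (columns), 2 batches.
counts = np.array([
    [10, 0, 5],
    [12, 3, 4],
    [100, 20, 0],
    [90, 25, 10],
])
batches = np.array(["A", "A", "B", "B"])

clr = clr_transform(counts)
zmc = zero_mean_center(clr, batches)
```

After this step every taxon has mean zero within each batch. When a class of interest appears in only one batch, centering each batch this way also removes part of the class signal, which is the imbalance problem the description raises.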

Provider:

USDA-ARS
