
Preserving biological heterogeneity with personalized genomics batch correction

Source: NIAID Data Ecosystem (indexed 2026-03-11)
Download link:
https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE53355
Description:
Motivation: Sample source, procurement process, and other technical variations introduce batch effects into genomics data. Algorithms that remove these artifacts enhance differences between known biological covariates, but they also risk removing intra-group biological heterogeneity and, with it, any personalized genomic signatures. As a result, accurate identification of novel subtypes from batch-corrected genomics data is challenging with standard algorithms, which are designed to remove batch effects for class comparison analyses. Nor can batch effects be corrected reliably in future applications of genomics-based clinical tests, in which the biological groups are by definition unknown a priori.

Results: We therefore introduce a new algorithm, personalized-SVA (pSVA), that is blind to biological covariates and corrects technical artifacts while retaining biological heterogeneity in genomic data. This algorithm facilitated accurate subtype identification in head and neck cancer from gene expression data in both formalin-fixed and frozen samples. When applied to predict HPV status, pSVA improved cross-study validation even when the sample batches were highly confounded with HPV status in the training set.

Availability: All analyses were performed using R version 2.15.0. The code and data used to generate the results of this manuscript are available from https://sourceforge.net/projects/psva.

The dataset comprises 44 Head and Neck Squamous Cell Carcinoma (HNSCC) primary tumors that were either flash frozen (29) or preserved as formalin-fixed, paraffin-embedded (FFPE) tissue (15), with 4 from both. Gene expression data from 38 of these samples were also measured using different amplification kits and are available in GSE3292 and GSE10300. Data are log2 expression values obtained from fRMA normalization.
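For readers who want to fetch the expression matrix programmatically rather than through the web page, the sketch below builds the URL where GEO conventionally stores a series matrix file. The helper name `geo_series_matrix_url` is hypothetical, and the directory layout is an assumption based on GEO's usual FTP convention, not something stated in this record. A series that spans several platforms may use differently named matrix files, so the URL should be checked before relying on it.

```python
import re

# Assumed root of the GEO series area; not given in this record.
GEO_FTP_ROOT = "https://ftp.ncbi.nlm.nih.gov/geo/series"


def geo_series_matrix_url(accession: str) -> str:
    """Build the conventional series-matrix URL for a GEO series accession.

    Hypothetical helper: it only formats a string and performs no network access.
    """
    match = re.fullmatch(r"GSE(\d+)", accession.strip().upper())
    if match is None:
        raise ValueError(f"not a GEO series accession: {accession!r}")
    digits = match.group(1)
    # GEO groups series into directories by replacing the last three
    # digits of the accession number with "nnn" (e.g. GSE53355 -> GSE53nnn).
    stub = "GSE" + digits[:-3] + "nnn"
    return (
        f"{GEO_FTP_ROOT}/{stub}/GSE{digits}/matrix/"
        f"GSE{digits}_series_matrix.txt.gz"
    )


if __name__ == "__main__":
    # Accessions mentioned in the description above.
    for acc in ("GSE53355", "GSE3292", "GSE10300"):
        print(acc, geo_series_matrix_url(acc))
```

A series matrix file is gzip-compressed, tab-delimited text in which metadata lines are prefixed with `!`, so those lines need to be skipped before the values are parsed as a gene-by-sample table.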
Created:
2019-03-25