GSE4271
Source: DataCite Commons | Updated: 2020-09-04 | Indexed: 2024-07-25
Download link:
https://figshare.com/articles/dataset/GSE4271/2069692/2
Description:
We have included gene-expression data, the outcome (class) being predicted, and any clinical covariates. Where gene-expression data were processed in multiple batches, we have provided batch information. Each dataset is organized as a file set that contains all pertinent files for that dataset. The gene-expression files have been normalized with both the SCAN and UPC methods, using the SCAN.UPC package in Bioconductor (https://www.bioconductor.org/packages/release/bioc/html/SCAN.UPC.html). We summarized the data at the gene level using the BrainArray resource (http://brainarray.mbni.med.umich.edu/Brainarray/Database/CustomCDF/20.0.0/ensg.asp), with Ensembl gene identifiers. The class, clinical, and batch data were hand-curated to ensure consistency ("tidy data" formatting). In addition, the data files have been formatted for easy import into the ML-Flex machine learning package (http://mlflex.sourceforge.net/).
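Because the description says the expression, class, and batch data follow a "tidy data" layout, one plausible way to combine them is a join on a shared sample identifier. The following is only a minimal sketch under stated assumptions: the column names (`SampleID`, `Class`, `Batch`), the tab-delimited layout, and the toy values are hypothetical and are not taken from the GSE4271 file set, whose actual file names and columns should be checked after download.

```python
import csv
import io

# Toy stand-ins for the three kinds of tables the description mentions.
# ASSUMPTION: column names, delimiter, and all values are made up for
# illustration; the real files may be laid out differently.
EXPRESSION_TSV = (
    "SampleID\tENSG00000000003\tENSG00000000005\n"
    "S1\t0.12\t-0.40\n"
    "S2\t0.55\t0.08\n"
)
CLASS_TSV = "SampleID\tClass\nS1\tA\nS2\tB\n"
BATCH_TSV = "SampleID\tBatch\nS1\t1\nS2\t1\n"


def read_tsv(text):
    """Parse tab-delimited text into a dict keyed by the first column."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    key = reader.fieldnames[0]
    return {
        row[key]: {k: v for k, v in row.items() if k != key}
        for row in reader
    }


def merge_by_sample(*tables):
    """Inner-join tables on sample ID, keeping samples present in all."""
    shared = set(tables[0])
    for table in tables[1:]:
        shared &= set(table)
    merged = {}
    for sample in sorted(shared):
        merged[sample] = {}
        for table in tables:
            merged[sample].update(table[sample])
    return merged


merged = merge_by_sample(
    read_tsv(EXPRESSION_TSV), read_tsv(CLASS_TSV), read_tsv(BATCH_TSV)
)
```

An inner join is used here so that samples missing a class label or batch assignment are dropped rather than silently padded. Values are left as strings, and converting the expression columns to floats would be a natural next step.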
Provider:
figshare
Created:
2016-02-04



