Biomarker Benchmark - GSE37745

Figshare. Updated 2016-03-17. Indexed 2026-04-29.

Download link:

https://figshare.com/articles/dataset/GSE37745/2069707

Resource description:

[NOTICE: This data set has been deprecated. Please see our new version of the data (and additional data sets) here: https://osf.io/mhk93 ]

"Background: Global gene expression profiling has been widely used in lung cancer research to identify clinically relevant molecular subtypes as well as to predict prognosis and therapy response. So far, the value of these multi-gene signatures in clinical practice is unclear, and the biological importance of individual genes is difficult to assess because the published signatures show virtually no overlap.

Methods: Here we describe a novel single-institute cohort, including 196 non-small cell lung cancer (NSCLC) cases with clinical information and long-term follow-up, which was used as a training set to screen for single genes with prognostic impact. The top 450 gene probe sets identified using a univariate Cox regression model (significance level p [...]

Results: The meta-analysis revealed that 17 probe sets were significantly associated with survival (p [...]

Conclusions: We were able to validate single genes with independent prognostic impact using a novel NSCLC cohort together with a meta-analysis approach. CADM1 was identified as an immunohistochemical marker with a potential application in clinical diagnostics."

http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE37745

We have included gene-expression data, the outcome (class) being predicted, and any clinical covariates. When gene-expression data were processed in multiple batches, we have provided batch information. Each data set is organized into a file set that contains all pertinent files for that data set. The gene-expression files have been normalized with both the SCAN and UPC methods, using the SCAN.UPC package in Bioconductor (https://www.bioconductor.org/packages/release/bioc/html/SCAN.UPC.html). We summarized the data at the gene level using the BrainArray resource (http://brainarray.mbni.med.umich.edu/Brainarray/Database/CustomCDF/20.0.0/ensg.asp), with Ensembl gene identifiers. The class, clinical, and batch data were hand curated to ensure consistency ("tidy data" formatting). In addition, the data files have been formatted so that they can be imported easily into the ML-Flex machine learning package (http://mlflex.sourceforge.net/).
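
As a rough illustration of how the gene-expression and class files described above might be combined before analysis, the following Python sketch joins two tab-separated tables on a shared sample identifier. The file names and column layouts in the file set are not specified here, so the column names (SampleID, Class), the sample identifiers, and the expression values below are all made-up assumptions. The Ensembl-style column headers are placeholders only.

```python
import csv
import io

# Hypothetical file contents. The real layout of the file set is not
# described in this listing, so every name and value here is an assumption.
EXPRESSION_TSV = (
    "SampleID\tENSG00000000001\tENSG00000000002\n"
    "SAMPLE_A\t0.12\t-0.40\n"
    "SAMPLE_B\t0.55\t0.31\n"
    "SAMPLE_C\t-0.08\t0.90\n"
)
CLASS_TSV = (
    "SampleID\tClass\n"
    "SAMPLE_A\t0\n"
    "SAMPLE_B\t1\n"
    "SAMPLE_D\t1\n"
)


def read_table(text, key="SampleID"):
    """Parse tab-separated text into {sample_id: {column: value}}."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    table = {}
    for row in reader:
        sample_id = row.pop(key)
        table[sample_id] = row
    return table


def join_expression_with_class(expression_text, class_text):
    """Keep only samples present in both tables and attach the class label."""
    expression = read_table(expression_text)
    classes = read_table(class_text)
    shared_ids = sorted(set(expression) & set(classes))
    merged = []
    for sample_id in shared_ids:
        features = {
            gene: float(value)
            for gene, value in expression[sample_id].items()
        }
        merged.append(
            {
                "SampleID": sample_id,
                "Class": classes[sample_id]["Class"],
                "features": features,
            }
        )
    return merged


merged = join_expression_with_class(EXPRESSION_TSV, CLASS_TSV)
for record in merged:
    print(record["SampleID"], record["Class"], record["features"])
```

Samples that appear in only one table are dropped by the inner join. Clinical covariates or batch information could be attached the same way, provided those files share the sample identifier column.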

Created:

2016-03-17
