DataSheet15_Comparison of RNA-Seq and microarray in the prediction of protein expression and survival prediction.PDF
Source: NIAID Data Ecosystem (indexed 2026-05-01)
Download link:
https://figshare.com/articles/dataset/DataSheet15_Comparison_of_RNA-Seq_and_microarray_in_the_prediction_of_protein_expression_and_survival_prediction_PDF/25271572
Description:
Gene expression profiling with RNA sequencing (RNA-seq) and microarray technologies is widely used in cancer research to identify biomarkers for clinical endpoint prediction. We compared the performance of the two methods in predicting protein expression and clinical endpoints using The Cancer Genome Atlas (TCGA) datasets for lung, colorectal, renal, breast, endometrial, and ovarian cancer. Both the RNA-seq and the microarray data were retrieved from TCGA. We calculated correlation coefficients between gene expression measured by RNA-seq or microarray and protein expression measured by reverse phase protein array (RPPA). In addition, after selecting the top 103 survival-related genes, we compared the performance of random forest survival prediction models across platforms and cancer types. Most genes showed similar correlation coefficients with RNA-seq and microarray, but 16 genes exhibited significant differences between the two methods. The BAX gene was recurrently identified in colorectal, renal, and ovarian cancer, and the PIK3CA gene in renal and breast cancer. The microarray-based survival prediction model outperformed the RNA-seq model in colorectal, renal, and lung cancer, whereas the RNA-seq model performed better in ovarian and endometrial cancer. Our results showed good correlation between mRNA levels and protein levels measured by RPPA. Although RNA-seq and microarray performed similarly overall, some genes showed differences whose clinical significance requires further evaluation. The survival prediction results were mixed, with neither platform consistently better across cancer types.
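The per-gene comparison described above (correlating mRNA levels from each platform with RPPA protein levels, then comparing the two coefficients) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the description does not state which correlation coefficient was used, so Pearson is an assumption here, the function names `pearson` and `per_gene_correlation` are hypothetical, and the expression values are toy numbers rather than TCGA data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")  # correlation is undefined for a constant series
    return cov / sqrt(var_x * var_y)


def per_gene_correlation(mrna, protein):
    """Correlate mRNA and protein levels for every gene present in both tables.

    Both arguments map gene symbol -> list of values, one per sample,
    with samples in the same order in both tables.
    """
    shared = sorted(mrna.keys() & protein.keys())
    return {gene: pearson(mrna[gene], protein[gene]) for gene in shared}


# Toy values for illustration only (assumed, not taken from TCGA).
rnaseq = {"BAX": [1.0, 2.0, 3.0, 4.0], "PIK3CA": [2.0, 1.0, 4.0, 3.0]}
microarray = {"BAX": [1.1, 1.9, 3.2, 3.9], "PIK3CA": [4.0, 3.0, 2.0, 1.0]}
rppa = {"BAX": [0.5, 1.0, 1.5, 2.0], "PIK3CA": [1.0, 0.5, 2.0, 1.5]}

r_rnaseq = per_gene_correlation(rnaseq, rppa)
r_microarray = per_gene_correlation(microarray, rppa)

# Per-gene difference between platforms; large absolute values flag genes
# where the two platforms disagree about the mRNA-protein relationship.
platform_gap = {gene: r_rnaseq[gene] - r_microarray[gene] for gene in r_rnaseq}

for gene, gap in platform_gap.items():
    print(f"{gene}: RNA-seq r={r_rnaseq[gene]:.2f}, "
          f"microarray r={r_microarray[gene]:.2f}, gap={gap:.2f}")
```

Deciding whether a gap is statistically significant, as reported for the 16 genes, would need a formal test for comparing correlations, which this sketch omits.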
Created:
2024-02-23



