
A transcriptome-based deep neural network classifier for identifying the site of origin in mucinous cancer

Source: NIAID Data Ecosystem (indexed 2026-03-12)
Download link:
https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE163126
Resource description:
Background: There is a lack of tools for identifying the site of origin in mucinous cancer. This study aimed to evaluate the performance of a transcriptome-based classifier for identifying the site of origin in mucinous cancer.

Methods: Transcriptomic data from 1,878 non-mucinous and 82 mucinous cancer specimens, obtained from The Cancer Genome Atlas, were used as the training and validation sets, respectively. The specimens covered 7 sites of origin: the uterine cervix (CESC), colon (COAD), pancreas (PAAD), stomach (STAD), uterine endometrium (UCEC), uterine carcinosarcoma (UCS), and ovary (OV). Transcriptomic data from 14 mucinous cancer specimens from a tissue archive were used as the test set. To identify the site of origin, a set of 100 differentially expressed genes was selected for each site of origin. After removing genes that appeared in more than one set, 626 genes remained, and their RNA expression profiles at each site of origin were used to train the deep neural network classifier. The performance of the classifier was estimated using the training, validation, and test sets.

Results: The accuracy of the model was 0.977 in the training set, 0.939 (77/82) in the validation set, and 0.857 (12/14) in the test set. t-SNE analysis showed that the test-set samples fell within the clusters obtained for the training set.

Conclusion: Although limited by a small sample size, the study showed that a transcriptome-based classifier could correctly identify the site of origin of mucinous cancer.

The classifier was trained and validated using a publicly available database, and its performance was estimated in a test set of archival tissues. This submission is the test set of 14 samples. Raw data are not available for the 8 CAM samples because the files were deleted; FPKMs are available for all 14 samples.
Created:
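The feature-selection step and the reported accuracies can be illustrated with a minimal Python sketch. It is not the authors' code: the function names and the toy gene symbols (GENE_A and so on) are hypothetical, and only the counts 77/82 and 12/14 come from the description. The sketch shows how per-site gene lists collapse into one deduplicated feature list (7 sites with 100 genes each give 700 entries, so 626 unique genes imply 74 repeated entries) and how the reported accuracies follow from the counts.

```python
from typing import Dict, List


def union_of_signatures(top_genes_by_site: Dict[str, List[str]]) -> List[str]:
    """Merge per-site gene lists, keeping each gene once in first-seen order."""
    seen = set()
    merged = []
    for genes in top_genes_by_site.values():
        for gene in genes:
            if gene not in seen:
                seen.add(gene)
                merged.append(gene)
    return merged


def accuracy(correct: int, total: int) -> float:
    """Fraction of specimens whose site of origin was predicted correctly."""
    return correct / total


# Toy input (hypothetical gene names; the real study used 100 genes per site).
toy_signatures = {
    "COAD": ["GENE_A", "GENE_B", "GENE_C"],
    "STAD": ["GENE_B", "GENE_D"],
    "OV": ["GENE_A", "GENE_E"],
}

features = union_of_signatures(toy_signatures)
print(features)  # ['GENE_A', 'GENE_B', 'GENE_C', 'GENE_D', 'GENE_E']

# Counts reported in the description.
print(round(accuracy(77, 82), 3))  # 0.939 (validation set)
print(round(accuracy(12, 14), 3))  # 0.857 (test set)
```

Keeping first-seen order makes the feature columns reproducible across runs, which matters when the same gene order must be applied to the training, validation, and test matrices.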
2020-12-19