Bulk-RNA Sequencing of high-grade pancreatic and non-pancreatic Neuroendocrine Neoplasms

NIAID Data Ecosystem, indexed 2026-03-12
Download link:
https://www.omicsdi.org/dataset/ega/EGAS00001004861
Description:
Background: Therapeutic decisions in oncology depend on a precise pathological classification of individual neoplasms. Recent years have seen intensified research into extracting clinically relevant information from patient-derived 'omics' data with Machine-Learning models. However, training such models comprehensively requires large numbers of training samples, which are usually not available for rare cancer types. The problem worsens when the neoplasms of an individual tissue segregate into different cancer subtypes, because discriminating between them requires even more training samples.

Methods: Here, we report a new data-augmentation technique to support the training of Machine-Learning models on 'omics' data from pancreatic neuroendocrine neoplasms (panNENs). PanNENs display all of the properties described above: only about 2-3% of all pancreatic neoplasms are neuroendocrine, and they fall into subtypes with distinctly different prognoses, which makes the precise classification of such samples both difficult and important for therapy decisions. The approach reconstructs a given transcriptome from healthy pancreatic cell-type signatures and builds a Machine-Learning model that uses the observed reconstruction error and the predicted cell-type proportions as features.

Results: A benchmark of the deconvolution model's predictions against the ground truth showed that the model could efficiently predict sample grading and disease-related patient survival time, and could differentiate between subtypes, in four panNEN datasets and one mixed dataset of panNENs and non-pancreatic NENs. We compared the predictive performance of the deconvolution-trained model with that of a model trained directly on the transcriptomic data, with inclusion of the gold-standard classification biomarker Ki-67, and found the performances to be comparable.

Conclusion: Our approach serves as a data-augmentation technique that facilitates the training of Machine-Learning models for rare cancer types by utilizing data of healthy origin. Additionally, the deconvolution results were clinically interpretable, and further research could make the deconvolution approach an effective complementary asset for the clinical classification of neoplasms.

EGA study EGAS00001004861
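The feature construction described in the Methods (reconstruct a bulk transcriptome from healthy cell-type signatures, then use the reconstruction error and the estimated cell-type proportions as model inputs) might be sketched as follows. This is a minimal illustration and not the study's implementation: the deconvolution algorithm is not named in this description, so non-negative least squares solved by projected gradient descent is used here as a stand-in, and the signature matrix, cell-type columns, gene count, and expression values are all hypothetical toy numbers.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]  # rows = genes, columns = cell types


def deconvolve(signatures: Matrix, bulk: List[float],
               n_iter: int = 2000) -> Tuple[List[float], float]:
    """Fit bulk ~ signatures @ weights with weights >= 0.

    Stand-in solver (assumption): projected gradient descent on the
    least-squares objective. Returns normalised cell-type proportions
    and the Euclidean reconstruction error.
    """
    n_types = len(signatures[0])
    # 1 / (sum of squared entries) never exceeds 1 / (largest eigenvalue
    # of S^T S), so this step size keeps the descent stable.
    step = 1.0 / sum(v * v for row in signatures for v in row)
    weights = [0.0] * n_types

    def residual(w: List[float]) -> List[float]:
        return [sum(s * x for s, x in zip(row, w)) - b
                for row, b in zip(signatures, bulk)]

    for _ in range(n_iter):
        res = residual(weights)
        gradient = [sum(row[k] * r for row, r in zip(signatures, res))
                    for k in range(n_types)]
        # Gradient step followed by projection onto the non-negative orthant.
        weights = [max(0.0, w - step * g) for w, g in zip(weights, gradient)]

    error = math.sqrt(sum(r * r for r in residual(weights)))
    total = sum(weights)
    proportions = [w / total for w in weights] if total > 0 else weights
    return proportions, error


def build_features(signatures: Matrix, bulk: List[float]) -> List[float]:
    """Feature vector: cell-type proportions followed by reconstruction error."""
    proportions, error = deconvolve(signatures, bulk)
    return proportions + [error]


# Hypothetical signature matrix: 4 genes x 3 healthy cell types.
SIGNATURES: Matrix = [
    [10.0, 1.0, 0.0],
    [1.0, 8.0, 1.0],
    [0.0, 1.0, 9.0],
    [2.0, 2.0, 2.0],
]

# A sample that is an exact 60/30/10 mixture of the three signatures.
mixture_like = build_features(SIGNATURES, [6.3, 3.1, 1.2, 2.0])
# A sample whose expression pattern no signature mixture can explain.
aberrant = build_features(SIGNATURES, [1.0, 1.0, 1.0, 9.0])

print("mixture-like sample:", mixture_like)
print("aberrant sample:", aberrant)
```

The two toy samples show why the reconstruction error can be informative: a transcriptome that the healthy signatures explain well yields a small error, while one that deviates from every healthy mixture yields a large one. The resulting vectors could then be passed to any classifier or survival model. In practice a library solver such as `scipy.optimize.nnls` would likely replace the hand-written loop.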
Created:
2020-12-04