Synthetic bulk RNA-Seq transcriptomic profiles representing 10 Cancer hallmarks

Source: DataONE | Updated 2025-01-09 | Indexed 2025-04-26
Download link:
https://search.dataone.org/view/sha256:aeb0b1b957ae5bd64c5d51b73be04af17066f3d637233d4744f0ea18bd09a177
Description:
Evidence before this study

We conducted an extensive literature search using Google Scholar without language restrictions, employing search terms such as “(Predicting OR Classifying OR Annotating) and (cancer hallmarks) AND (Deep OR Machine Learning) OR (Artificial Intelligence OR AI).” Despite notable advances in molecular oncology and computational methodologies, a critical gap remains: no existing machine learning or deep learning framework comprehensively predicts cancer hallmarks from tumor biopsy samples. Current research primarily targets specific molecular pathways associated with individual hallmarks, leaving clinicians without an integrated model to interpret hallmark activity at the level of an individual tumor. Moreover, the absence of wet-lab techniques capable of annotating all cancer hallmarks in biopsy samples has further impeded progress, limiting the clinical utility of hallmark-related insights for precision oncology.

Added value of this study

This study introdu...

Dataset Collection and Processing

We utilized a large-scale dataset comprising 2.7 million single-cell transcriptomes derived from 14 tumor types, collected from 922 patients across 51 independent studies conducted globally. This dataset was sourced from the Weizmann Institute's 3CA repository.

Quality Control

Before generating synthetic datasets for model training, the raw single-cell transcriptomic data underwent a rigorous quality control (QC) process. Cells with more than 15% mitochondrial transcript content, or with fewer than 200 or more than 6,000 expressed mRNA transcripts, were excluded to ensure data reliability.

Gene Set Curation

Gene sets representing cancer hallmarks were compiled from multiple databases, retaining only genes identified in at least two independent sources. This selection was refined through manual literature reviews to exclude genes without direct or indirect roles in hallmark-related pathways.
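The "at least two independent sources" rule in the gene set curation step can be illustrated with a minimal sketch. It is not the authors' code: the function name `curate_gene_set`, the source names, and the placeholder gene identifiers are all hypothetical, and the manual literature review that follows this step in the description is not modeled.

```python
from collections import Counter


def curate_gene_set(sources, min_sources=2):
    """Return genes that appear in at least `min_sources` of the given sources.

    `sources` maps a (hypothetical) database name to its list of gene symbols.
    """
    counts = Counter()
    for genes in sources.values():
        # Use a set so a gene listed twice in one database counts only once.
        counts.update(set(genes))
    return sorted(gene for gene, n in counts.items() if n >= min_sources)


# Toy input with placeholder names; these are not real hallmark gene sets.
example_sources = {
    "database_a": ["GENE_A", "GENE_B", "GENE_C"],
    "database_b": ["GENE_A", "GENE_C", "GENE_D"],
    "database_c": ["GENE_B", "GENE_E", "GENE_E"],
}

curated = curate_gene_set(example_sources)
print(curated)
```

In this toy input, `GENE_D` and `GENE_E` are each reported by a single source and are dropped, while the other three genes are retained.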
Digital Scoring

Using the curated gene sets, Digital Scores were...

# Synthetic bulk RNA-Seq transcriptomic profiles representing 10 Cancer hallmarks

[https://doi.org/10.5061/dryad.zw3r228jc](https://doi.org/10.5061/dryad.zw3r228jc)

## Description of the data and file structure

### Data Description: Experimental Efforts

This dataset comprises single-cell transcriptomic data from the Weizmann 3CA repository, encompassing 2.7 million single-cell transcriptomes from 14 tumor types, collected from 922 patients across 51 global studies. The primary objective of the experimental efforts was to generate synthetic datasets for training and validating computational models to identify and analyze cancer hallmarks at single-cell resolution.

Single-cell RNA sequencing (scRNA-seq) data underwent a rigorous quality control process to ensure reliability and biological relevance. This included exclusion criteria based on mitochondrial transcript content (>15%) and mRNA transcript counts (<200 or >6,000 transcripts). Gene sets corresponding to 10 estab...
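The quality control thresholds stated in the description (mitochondrial content above 15%, fewer than 200 or more than 6,000 transcripts) can be sketched as a simple per-cell filter. This is an illustrative assumption about how the rule might be applied, not the authors' pipeline: the `CellStats` record, its field names, and the toy cells are hypothetical, and whether the boundary values themselves pass is inferred from the wording "over", "fewer than", and "more than".

```python
from dataclasses import dataclass


@dataclass
class CellStats:
    """Per-cell summary (hypothetical record, not a format from the dataset)."""
    cell_id: str
    n_transcripts: int    # count of expressed mRNA transcripts in the cell
    mito_fraction: float  # fraction of transcripts that are mitochondrial


def passes_qc(cell, max_mito=0.15, min_transcripts=200, max_transcripts=6000):
    """Keep a cell unless it exceeds 15% mitochondrial content or falls
    outside the 200 to 6,000 transcript range stated in the description."""
    return (
        cell.mito_fraction <= max_mito
        and min_transcripts <= cell.n_transcripts <= max_transcripts
    )


# Toy cells chosen to hit each exclusion rule once.
cells = [
    CellStats("c1", 1500, 0.05),  # within all thresholds
    CellStats("c2", 150, 0.02),   # too few transcripts
    CellStats("c3", 7000, 0.03),  # too many transcripts
    CellStats("c4", 3000, 0.20),  # mitochondrial content too high
    CellStats("c5", 200, 0.15),   # exactly on the boundaries
]

kept_ids = [cell.cell_id for cell in cells if passes_qc(cell)]
print(kept_ids)
```

Single-cell toolkits usually provide their own routines for computing and filtering on these metrics; the plain-Python version here only makes the stated thresholds explicit.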
Created:
2025-01-10