Topological Identification and Interpretation for Single-cell Epigenetic Regulation Elucidation in Multi-tasks using scAGDE

Source: DataCite Commons. Updated: 2025-04-01. Indexed: 2025-09-08.
Download link:
https://springernature.figshare.com/articles/dataset/Topological_Identification_and_Interpretation_for_Single-cell_Epigenetic_Regulation_Elucidation_in_Multi-tasks_using_scAGDE/26067511/1
Description:
We generated a total of 17 simulated datasets from bulk ATAC-seq data of bone marrow, which contains six FACS-sorted cell populations. Following a previously published benchmarking framework for scATAC-seq tools, we set the parameter n, which determines the fragment count within a single cell, to 250, 500, 1500, 2500, and 5000, obtaining five datasets of varying sequencing depth. We set the parameter q, which controls the proportion of cell-specific reads, to 0, 0.1, 0.2, 0.3, and 0.4, obtaining five datasets of differing noise levels. Lastly, we randomly dropped valid reads at rates ranging from 10% to 70%, generating seven datasets with varying degrees of dropout.
Additionally, we collected 11 publicly available scATAC-seq datasets with given cell type labels for benchmarking, to validate the effectiveness of scAGDE. These datasets, generated on different platforms and including human and mouse samples, vary in sparsity and scale. Six datasets annotated through computational approaches were ‘Forebrain’ (GSE100033), ‘Splenocyte’ (E-MTAB-6714), ‘GM12878vsHEK’ (GSE65360), ‘GM12878vsHL’ (GSE149683), ‘Lung’ (GSE149683), and ‘Liver’ (GSE65360). Three datasets containing FACS-sorted cell populations were ‘Blood2K’ (GSE96772), ‘10XBlood’ (GSE129785), and ‘DropBlood’ (GSE123581). The remaining two datasets were ‘Leukemia’ (GSE74310), which mixes cells from a healthy donor with leukemia cells from two acute myeloid leukemia (AML) patients, and ‘InSilico’ (GSE65360), which combines six individual scATAC-seq datasets from distinct cell lines.
The human fetal atlas dataset from Domcke et al. can be obtained from the public resource at https://descartes.brotmanbaty.org/bbi/human-chromatin-during-development/. The human brain dataset, downloadable from GSE184462, comes from a single-cell scATAC-seq atlas of the human genome.
The reference single-cell RNA-seq dataset from human brain tissue used in our study is available under GSE207334. All processed datasets for the benchmarking analysis, together with the human brain dataset, have been deposited in the Zenodo database at https://doi.org/10.5281/zenodo.11609252.
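The simulation design described above (five depths, five noise levels, seven dropout rates) can be sketched as a small parameter grid. This is only an illustrative sketch, not the authors' simulator: the names `simulation_grid` and `drop_reads` are hypothetical, the seven dropout rates are assumed to be evenly spaced between the stated 10% and 70%, and the per-read independent thinning is one plausible reading of "randomly dropped valid reads".

```python
import random

# Values stated in the description.
DEPTHS = [250, 500, 1500, 2500, 5000]      # parameter n: fragments per cell
NOISE_LEVELS = [0, 0.1, 0.2, 0.3, 0.4]     # parameter q: noise level

# ASSUMPTION: the description gives only the range (10% to 70%) and the
# number of datasets (seven); evenly spaced rates are a guess.
DROPOUT_RATES = [round(0.1 * k, 1) for k in range(1, 8)]


def simulation_grid():
    """List one configuration per simulated dataset (5 + 5 + 7 = 17)."""
    grid = [{"kind": "depth", "n": n} for n in DEPTHS]
    grid += [{"kind": "noise", "q": q} for q in NOISE_LEVELS]
    grid += [{"kind": "dropout", "rate": r} for r in DROPOUT_RATES]
    return grid


def drop_reads(counts, rate, seed=0):
    """Hypothetical dropout step: keep each read independently
    with probability 1 - rate (binomial thinning of per-peak counts)."""
    rng = random.Random(seed)
    keep = 1.0 - rate
    return [sum(rng.random() < keep for _ in range(c)) for c in counts]


if __name__ == "__main__":
    print(len(simulation_grid()), "simulated configurations")
    print(drop_reads([5, 0, 12, 3], rate=0.3))
```

Thinning each read independently keeps the expected count proportional to the original, so the dropout rate is the only quantity that changes between the seven dropout datasets in this sketch.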
Provider:
figshare
Created:
2025-02-17