
Table2_Completing Single-Cell DNA Methylome Profiles via Transfer Learning Together With KL-Divergence.XLSX

Source: frontiersin.figshare.com | Updated: 2023-06-16 | Indexed: 2025-01-21
Download link:
https://frontiersin.figshare.com/articles/dataset/Table2_Completing_Single-Cell_DNA_Methylome_Profiles_via_Transfer_Learning_Together_With_KL-Divergence_XLSX/20354385/1
Resource description:
Methylome profiles obtained by whole-genome bisulfite sequencing are highly sparse when little biological material is available, which limits their value in the study of systems in which large samples are difficult to assemble, such as mammalian preimplantation embryonic development. Recently developed computational methods address this sparsity by imputing missing values, but they reach their limits when the required minimum data coverage, or profiles of the same tissue in other modalities, are not available. In this study, we explored the use of transfer learning together with Kullback-Leibler (KL) divergence to train predictive models for completing methylome profiles with very low coverage (below 2%). Transfer learning was used to leverage the less sparse profiles that are typically available for different tissues of the same species, while KL divergence was employed to maximize the use of the information carried in the input data. A deep neural network was adopted to extract both DNA sequence and local methylation patterns for imputation. Our study of training models for completing methylome profiles of bovine oocytes and early embryos demonstrates the effectiveness of transfer learning and KL divergence, with individual increases of 29.98% and 29.43%, respectively, in prediction performance, and a 38.70% increase when the two were used together. The drastically increased data coverage after imputation (43.80–73.6%) enables downstream methylome analyses that cannot be done effectively with the very low coverage profiles available before imputation (0.06–1.47%).
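The abstract does not say how the KL divergence term is formulated. As a rough illustration only, the sketch below assumes each covered CpG site has an observed methylation rate in [0, 1], treats that rate and the model's predicted probability as two Bernoulli distributions, and averages KL(observed || predicted) over covered sites. The function names, the use of `None` to mark uncovered sites, and the `eps` clipping value are all hypothetical and are not taken from the paper.

```python
import math


def bernoulli_kl(p_obs, q_pred, eps=1e-7):
    """KL(p || q) between two Bernoulli distributions.

    p_obs: observed methylation rate at a site (assumed to lie in [0, 1]).
    q_pred: predicted methylation probability for the same site.
    eps: hypothetical clipping constant that keeps the logarithms finite.
    """
    p = min(max(p_obs, eps), 1.0 - eps)
    q = min(max(q_pred, eps), 1.0 - eps)
    return p * math.log(p / q) + (1.0 - p) * math.log((1.0 - p) / (1.0 - q))


def mean_kl_loss(observed, predicted):
    """Average the per-site KL divergence over covered sites only.

    Sites with no sequencing coverage are marked None (an assumption of
    this sketch) and are skipped, since they have no target value.
    """
    pairs = [(p, q) for p, q in zip(observed, predicted) if p is not None]
    if not pairs:
        return 0.0
    return sum(bernoulli_kl(p, q) for p, q in pairs) / len(pairs)


# Toy example: five CpG sites, two of them uncovered.
observed = [1.0, None, 0.25, None, 0.0]
predicted = [0.9, 0.5, 0.3, 0.7, 0.2]
loss = mean_kl_loss(observed, predicted)
```

Unlike a loss on binarized labels, this form keeps fractional methylation rates as targets, which is one plausible reading of "maximizing the use of information carried in the input data".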

Provider:
Frontiers