
GraphMB

Source: OpenXLab, indexed 2026-04-18
Download link:
https://openxlab.org.cn/datasets/OpenDataLab/GraphMB
Description:
The core idea behind GraphMB is to generate embeddings from contig-specific features and the assembly graph, cluster those embeddings into bins, and evaluate the resulting bins by completeness and contamination. Clustering embeddings rather than raw features has two advantages: the embeddings have lower dimensionality, and they can encode latent relationships present in the original features. GraphMB improves on existing binners by incorporating the assembly graph into the training process. The assembly graph describes which contigs are connected and how many reads support each connection (read coverage). This information is used to train a Graph Neural Network (GNN) and to generate embeddings that take the neighborhood of each contig into account. Figure 1 provides an overview of GraphMB, and the following sections explain each step of the process.
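To make the idea of neighborhood-aware embeddings concrete, the sketch below performs one round of coverage-weighted neighbor averaging over a toy assembly graph. It is an untrained stand-in for a single GNN aggregation step, not GraphMB's actual model or API: the function name `aggregate_neighbors`, the self-weight of 1.0, and the toy contig features are all assumptions made for illustration.

```python
from collections import defaultdict


def aggregate_neighbors(features, edges):
    """Blend each contig's feature vector with its neighbors' vectors.

    features: dict mapping contig name -> list of floats (hypothetical features)
    edges: list of (contig_a, contig_b, read_coverage) tuples from the assembly graph
    Returns a dict mapping contig name -> averaged vector. Each neighbor is
    weighted by the read coverage of the connecting edge; the contig itself
    gets an assumed weight of 1.0.
    """
    neighbors = defaultdict(list)
    for a, b, coverage in edges:
        neighbors[a].append((b, coverage))
        neighbors[b].append((a, coverage))

    embeddings = {}
    for contig, own in features.items():
        total_weight = 1.0          # assumed self-weight
        acc = list(own)             # start from the contig's own features
        for other, coverage in neighbors[contig]:
            total_weight += coverage
            for i, value in enumerate(features[other]):
                acc[i] += coverage * value
        embeddings[contig] = [value / total_weight for value in acc]
    return embeddings


# Toy data: c1 and c2 share an edge supported by one read; c3 is isolated.
features = {"c1": [0.0, 2.0], "c2": [4.0, 2.0], "c3": [10.0, 0.0]}
edges = [("c1", "c2", 1.0)]
embeddings = aggregate_neighbors(features, edges)
```

In this toy example the two connected contigs are pulled toward each other while the isolated contig keeps its original vector, which is the property that makes a subsequent clustering step more likely to place connected contigs in the same bin. A trained GNN would learn the aggregation weights instead of fixing them by hand.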
Provider:
OpenDataLab
Created:
2022-10-17