
Pairwise graph edit distance characterizes the impact of the construction method on pangenome graphs

Indexed by NIAID Data Ecosystem on 2026-05-02
Download link:
https://zenodo.org/record/10932489
Description:
Graph editing is a widely studied subject, with many heuristics for comparing topologies and many NP-hard problems. Here, we present a method that relies on the specific properties of a pangenome graph (a collection of subsequences linked by edges, representing the embedding of genomes inside a graph structure) to formulate an O(n) solution for this specific case. It allows us to pinpoint dissimilarities between graphs and to analyse how such graphs differ when built with different tools or parameters.

Data description:

Files `mgc_yeast_1` and `pggb_yeast_1`: The files referenced in Tab. 1. The `.gfa` file is the graph file and the `.vcf` file is the variant file, obtained using `vg deconstruct`.

Archive `yeast_dataset`: Contains the raw `.fasta` genomes used to build the yeast chromosome 1 graphs described in the publication.

Archive `json_datasets_results`: Contains the computed distance, variant, and sequence complexity analysis results as `.json` files.

Archive `human_data`: Contains the human graphs (chromosomes 1 and 21) referenced in Tab. 1, as well as the `.vcf` files. Human graphs are taken directly from Liao et al. 2023. The Minigraph-Cactus graph is the `.full.og` version of the CHM13-based pangenome; it was converted to `.gfa` with `odgi`, then passed to `vg` to obtain a GFA1.0 file.

Archive `reference_impact`: Contains the `.gfa` graphs used to compare the impact of the reference choice against the secondary genome order in Minigraph-Cactus (fig. 1A of the article).

Archive `mgc_vs_pggb`: Contains the `.gfa` graphs used to compare the impact of the reference choice in Minigraph-Cactus against PGGB (fig. 1B of the article).

Warning (applies to both `reference_impact` and `mgc_vs_pggb`): all graphs are provided as they came out of the Minigraph-Cactus and PGGB pipelines. Because `rs-pancat-compare` can only compare GFA1.0 files, you must first convert them using the `vg toolkit` (see the commands available on this GitHub).
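
The description does not spell out the algorithm, so the following is only an illustrative sketch of why a linear-time comparison is plausible when two graphs embed the same genomes. It assumes, as a simplification, that the difference between two graphs can be measured per genome by the node boundaries (breakpoints) that appear along that genome's path in one graph but not in the other. The function names (`parse_gfa1`, `breakpoints`, `segmentation_difference`) and the toy graphs are hypothetical; this is not the `rs-pancat-compare` implementation, and it only handles GFA1.0 `S` and `P` lines with explicit sequences.

```python
def parse_gfa1(text):
    """Read segment lengths and paths from GFA1.0 text (S and P lines only).

    Assumption: every S line carries its sequence explicitly (no '*' with LN tag).
    """
    seg_len = {}
    paths = {}
    for line in text.splitlines():
        fields = line.split("\t")
        if fields[0] == "S":
            seg_len[fields[1]] = len(fields[2])
        elif fields[0] == "P":
            # Steps look like "1+,2-,3+"; drop the trailing orientation sign.
            paths[fields[1]] = [step[:-1] for step in fields[2].split(",")]
    return seg_len, paths


def breakpoints(seg_len, steps):
    """Offsets along the genome where one node ends and the next begins."""
    positions = set()
    offset = 0
    for name in steps[:-1]:  # the end of the last node is not a breakpoint
        offset += seg_len[name]
        positions.add(offset)
    return positions


def segmentation_difference(gfa_a, gfa_b):
    """Per shared path, count breakpoints present in exactly one graph.

    Each path is walked once, so the cost is linear in total path length.
    """
    len_a, paths_a = parse_gfa1(gfa_a)
    len_b, paths_b = parse_gfa1(gfa_b)
    result = {}
    for name in paths_a.keys() & paths_b.keys():
        bp_a = breakpoints(len_a, paths_a[name])
        bp_b = breakpoints(len_b, paths_b[name])
        result[name] = len(bp_a ^ bp_b)
    return result


def make_gfa(rows):
    """Build tab-separated GFA text from lists of fields (toy helper)."""
    return "\n".join("\t".join(row) for row in rows)


# Two toy graphs spelling the same genome ACGTTTGCA with different node cuts.
GRAPH_A = make_gfa([
    ["H", "VN:Z:1.0"],
    ["S", "1", "ACGT"], ["S", "2", "TT"], ["S", "3", "GCA"],
    ["P", "genome1", "1+,2+,3+", "*"],
])
GRAPH_B = make_gfa([
    ["H", "VN:Z:1.0"],
    ["S", "1", "ACGTTT"], ["S", "2", "G"], ["S", "3", "CA"],
    ["P", "genome1", "1+,2+,3+", "*"],
])

if __name__ == "__main__":
    print(segmentation_difference(GRAPH_A, GRAPH_B))
```

In the toy example, graph A cuts the genome at offsets 4 and 6 and graph B at offsets 6 and 7, so two breakpoints (4 and 7) are not shared. Reverse-oriented steps do not change node lengths, which is why the sketch can ignore orientation when computing offsets.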
Created:
2024-12-09