Pairwise graph edit distance characterizes the impact of the construction method on pangenome graphs
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/record/10932489
Description:
Graph editing is a widely studied subject, with many heuristics for comparing topologies and many NP-hard problems. Here, we present a method that relies on the specific properties of a pangenome graph (a collection of subsequences linked by edges, representing the embedding of genomes in a graph structure) to formulate an O(n) solution for this specific case. It allows us to pinpoint dissimilarities between graphs and to analyse how such graphs differ when built with different tools or parameters.
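To give an intuition for why the pangenome setting admits a linear-time comparison, here is a minimal sketch. It assumes (this is our illustration, not a description of the `rs-pancat-compare` implementation) that each genome is spelled by consecutive segments in both graphs, so that two graphs can be compared genome by genome through the positions where segments are cut. The function names `breakpoints` and `path_edit_distance` are hypothetical.

```python
from itertools import accumulate


def breakpoints(segment_lengths):
    """Internal cut positions of a genome spelled by consecutive segments."""
    ends = list(accumulate(segment_lengths))
    # The last end is the genome length, which both graphs share.
    return set(ends[:-1])


def path_edit_distance(lengths_a, lengths_b):
    """Count (merges, splits) turning segmentation A into segmentation B.

    Assumption: both graphs embed the same genome, so only the
    segment boundaries differ. Runs in expected linear time.
    """
    if sum(lengths_a) != sum(lengths_b):
        raise ValueError("both segmentations must spell the same genome length")
    cuts_a = breakpoints(lengths_a)
    cuts_b = breakpoints(lengths_b)
    merges = len(cuts_a - cuts_b)  # cuts to remove from A
    splits = len(cuts_b - cuts_a)  # cuts to add to A
    return merges, splits


# Hypothetical example: the same 10 bp genome segmented two ways.
merges, splits = path_edit_distance([4, 3, 3], [4, 6])
```

In this toy case, graph A cuts the genome at positions 4 and 7 while graph B cuts only at 4, so one merge and no split are needed.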
Data description:
Files `mgc_yeast_1` and `pggb_yeast_1`:
These files are the ones referenced in Tab. 1. The `.gfa` file is the graph file and the `.vcf` file is the variant file, obtained using `vg deconstruct`.
Archive `yeast_dataset`:
Contains the raw `.fasta` genomes used to build the yeast chromosome 1 graphs described in the publication.
Archive `json_datasets_results`:
Contains the computed distance, variants, and sequence complexity analysis results as `.json` files.
Archive `human_data`:
Contains the human graphs (chromosomes 1 and 21) referenced in Tab. 1, as well as the `.vcf` files. The human graphs are taken directly from Liao et al. 2023. The Minigraph-Cactus graph is the `.full.og` version of the CHM13-based pangenome; it was converted to `.gfa` with `odgi` and then passed to `vg` to obtain a GFA1.0 file.
Archive `reference_impact`:
Contains the `.gfa` graphs used to compare the impact of the reference choice against that of the secondary genome order in Minigraph-Cactus (Fig. 1A of the article).
Warning: all graphs are provided as they came out of the Minigraph-Cactus and PGGB pipelines. Because `rs-pancat-compare` can only compare GFA1.0 files, you must convert them using the `vg toolkit` (see commands available on this GitHub).
Archive `mgc_vs_pggb`:
Contains the `.gfa` graphs used to compare the impact of the reference choice in Minigraph-Cactus against PGGB (Fig. 1B of the article).
Warning: all graphs are provided as they came out of the Minigraph-Cactus and PGGB pipelines. Because `rs-pancat-compare` can only compare GFA1.0 files, you must convert them using the `vg toolkit` (see commands available on this GitHub).
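Before running a comparison, it can help to check whether a given `.gfa` file still needs conversion. The sketch below is a heuristic of ours, not part of `rs-pancat-compare`: it reads the `VN:Z:` tag of the GFA header and counts record types, flagging files that do not declare version 1.0 or that contain `W` (walk) records, which were introduced in GFA 1.1. The function name `summarize_gfa` and the file name in the usage note are hypothetical.

```python
from collections import Counter


def summarize_gfa(lines):
    """Return (declared_version, record_type_counts) for GFA text lines."""
    version = None
    counts = Counter()
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        fields = line.split("\t")
        record_type = fields[0]
        counts[record_type] += 1
        if record_type == "H":
            # The header may carry a version tag such as VN:Z:1.0
            for tag in fields[1:]:
                if tag.startswith("VN:Z:"):
                    version = tag[len("VN:Z:"):]
    return version, counts


# Tiny in-memory example graph (made up for illustration).
example = [
    "H\tVN:Z:1.0",
    "S\t1\tACGT",
    "S\t2\tTT",
    "L\t1\t+\t2\t+\t0M",
    "P\tgenomeA\t1+,2+\t*",
]
version, counts = summarize_gfa(example)
# Heuristic: anything not declared as 1.0, or using W-lines, needs conversion.
needs_conversion = version != "1.0" or counts["W"] > 0
```

On a real file, the same function can be fed an open file handle, for example `with open("graph.gfa") as handle: summarize_gfa(handle)`.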
Created:
2024-12-09



