
Apis mellifera graph genome

NIAID Data Ecosystem, indexed 2026-05-01
Download link:
https://zenodo.org/record/7736363
Description:
AmelGraph 1.1.0

We aligned 5 publicly available assemblies with the cactus pangenome workflow (v2.0.5):

Use        Species             Genome ID         Accession        Graph ID   Size (Mb)
Reference  A. mellifera (DH4)  Amel_HAv3.1       GCF_003254395.2  DH4        225.2
Derivate   A. m. mellifera     INRA_AMelMel_1.0  GCA_003314205.1  mellifera  227.0
Derivate   A. m. carnica       ASM1384124v2      GCA_013841245.2  carnica    226.0
Derivate   A. m. caucasica     ASM1384120v1      GCA_013841205.1  caucasica  224.8
Derivate   A. m. ligustica     ASM1932182v1      GCA_019321825.1  ligustica  231.1

We used the cactus [1] pangenome workflow to generate a 5-way pangenome alignment, masking in blocks of 10 kb and considering all the sequences in the different assemblies. Due to incompatibilities with the index generation of the pangenome workflow, we regenerated the giraffe indexes from the output GFA/VCF file.

We also downloaded the annotations available for three genomes (DH4, A. m. carnica and A. m. caucasica), modifying the contig names to include the subspecies name (e.g. 'LG1' for A. m. caucasica was changed to 'caucasica.LG1'). We then ran the vg autoindex function:

    vg autoindex -w giraffe -o pangenome -t 8 -T ./TMP -x carnica.gff -x caucasica.gff -R XG -x DH4.gff

We compared this 'full' cactus graph to one built using only the sequence on the linkage groups (LG), and to one generated using the PGGB workflow. We evaluated the different graphs using PGGE. The full cactus graph returned the highest aligned identity, sequence matches, and unique alignments, while having the lowest number of multiple mappings and missing alignments.

Parameter          CACTUS (LG)  CACTUS (FULL)  PGGB (norm)
HAv3.1 size        221,626,419  225,250,884    225,250,884
Graph length (bp)  243,077,200  246,675,144    315,672,860
Extra sequence     21,450,781   21,424,260     90,421,976
# nodes            17,120,383   11,664,080     17,952,597
# edges            21,385,634   15,990,593     21,702,958

References

1. Armstrong, J. et al. Progressive Cactus is a multiple-genome aligner for the thousand-genome era. Nature 587 (2020).
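
The contig renaming mentioned in the description ('LG1' becoming 'caucasica.LG1') can be illustrated with a short Python sketch. This is not the authors' script, which the record does not show. It assumes a tab-separated GFF whose first column is the contig name, and it also rewrites `##sequence-region` header lines, which is an additional assumption. The function name `prefix_gff_contigs` and the example records are hypothetical.

```python
def prefix_gff_contigs(lines, subspecies):
    """Yield GFF lines with the contig name prefixed by '<subspecies>.'.

    Hypothetical helper: assumes tab-separated GFF records and does not
    handle an embedded ##FASTA section.
    """
    for line in lines:
        stripped = line.rstrip("\n")
        if stripped.startswith("##sequence-region"):
            # Header form: ##sequence-region <seqid> <start> <end>
            parts = stripped.split()
            if len(parts) >= 2:
                parts[1] = f"{subspecies}.{parts[1]}"
            yield " ".join(parts) + "\n"
        elif not stripped or stripped.startswith("#"):
            # Blank lines and other comments/directives pass through unchanged.
            yield stripped + "\n"
        else:
            fields = stripped.split("\t")
            fields[0] = f"{subspecies}.{fields[0]}"
            yield "\t".join(fields) + "\n"


# Made-up toy records, for illustration only.
example = [
    "##gff-version 3\n",
    "##sequence-region LG1 1 1000\n",
    "LG1\tRefSeq\tgene\t100\t900\t.\t+\t.\tID=gene0\n",
]
renamed = list(prefix_gff_contigs(example, "caucasica"))
print("".join(renamed), end="")
```

Whatever tool performs the renaming, the prefixed names need to match the path names used in the graph for the annotation to be placed on the intended sequences.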
Created:
2023-03-29