Apis mellifera graph genome

Mendeley Data. Updated 2024-05-10; indexed 2024-06-27.
Download link:
https://zenodo.org/records/7736364
Description:
AmelGraph 1.1.0

We aligned 5 different publicly available assemblies with the cactus pangenome workflow (v2.0.5):

Use        Species             Genome ID         Accession        Graph ID   Size (Mb)
Reference  A. mellifera (DH4)  Amel_HAv3.1       GCF_003254395.2  DH4        225.2
Derivate   A. m. mellifera     INRA_AMelMel_1.0  GCA_003314205.1  mellifera  227.0
Derivate   A. m. carnica       ASM1384124v2      GCA_013841245.2  carnica    226.0
Derivate   A. m. caucasica     ASM1384120v1      GCA_013841205.1  caucasica  224.8
Derivate   A. m. ligustica     ASM1932182v1      GCA_019321825.1  ligustica  231.1

We used the cactus [1] pangenome workflow to generate a 5-way pangenome alignment, masking in blocks of 10 kb and considering all the sequences in the different assemblies. Due to incompatibilities with the index generation step of the pangenome workflow, we regenerated the giraffe indexes from the output GFA/VCF file. We also downloaded the annotations available for three genomes (DH4, A. m. carnica and A. m. caucasica) and renamed the contigs to include the subspecies name (e.g. 'LG1' for A. m. caucasica became 'caucasica.LG1'). We then ran the vg autoindex function:

    vg autoindex -w giraffe -o pangenome -t 8 -T ./TMP -x carnica.gff -x caucasica.gff -R XG -x DH4.gff

We compared this 'full' cactus graph to one built using only the sequence on the linkage groups (LG), and to one generated using the PGGB workflow. We evaluated the different graphs using PGGE. The full cactus graph returned the highest aligned identity, sequence matches, and unique alignments, while having the lowest number of multiple-mapping and missing alignments.

Parameter          CACTUS (LG)   CACTUS (FULL)  PGGB (norm)
HAv3.1 size        221,626,419   225,250,884    225,250,884
Graph length (bp)  243,077,200   246,675,144    315,672,860
Extra sequence     21,450,781    21,424,260     90,421,976
# nodes            17,120,383    11,664,080     17,952,597
# edges            21,385,634    15,990,593     21,702,958

References

1. Armstrong, J. et al. Progressive Cactus is a multiple-genome aligner for the thousand-genome era. Nature 587, (2020).
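The contig renaming described above (for example 'LG1' becoming 'caucasica.LG1') could be done in many ways; the record does not say which tool the authors used. The following is only a minimal sketch in Python, assuming tab-separated GFF input in which the first column is the sequence ID. The function name `prefix_gff_contigs` is hypothetical, and the handling of `##sequence-region` directives is an assumption about what a consistent file would need.

```python
def prefix_gff_contigs(lines, subspecies):
    """Yield GFF lines whose sequence IDs are prefixed with '<subspecies>.'.

    Hypothetical helper: illustrates the renaming step only, and is not
    taken from the dataset authors' pipeline.
    """
    for line in lines:
        stripped = line.rstrip("\n")
        if stripped.startswith("##sequence-region"):
            # Directive layout: ##sequence-region <seqid> <start> <end>
            parts = stripped.split()
            if len(parts) >= 2:
                parts[1] = f"{subspecies}.{parts[1]}"
            yield " ".join(parts) + "\n"
        elif not stripped or stripped.startswith("#"):
            # Blank lines, comments and other directives pass through unchanged.
            yield stripped + "\n"
        else:
            # Feature line: column 1 is the contig / sequence ID.
            fields = stripped.split("\t")
            fields[0] = f"{subspecies}.{fields[0]}"
            yield "\t".join(fields) + "\n"
```

In use, one would open each downloaded annotation, pass its lines through the function with the matching graph ID ('carnica', 'caucasica' or 'DH4'), and write the result to the GFF file later given to `vg autoindex -x`. The prefix must match the path names in the graph exactly, otherwise the annotation cannot be placed on it.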
Created:
2023-06-28