Apis mellifera graph genome
Source: NIAID Data Ecosystem (indexed 2026-05-01)
Download link:
https://zenodo.org/record/7736363
Description:
AmelGraph 1.1.0
We aligned 5 different publicly available assemblies with the cactus pangenome workflow (v2.0.5):
Use        Species             Genome ID         Accession        Graph ID   Size (Mb)
Reference  A. mellifera (DH4)  Amel_HAv3.1       GCF_003254395.2  DH4        225.2
Derivate   A. m. mellifera     INRA_AMelMel_1.0  GCA_003314205.1  mellifera  227.0
Derivate   A. m. carnica       ASM1384124v2      GCA_013841245.2  carnica    226.0
Derivate   A. m. caucasica     ASM1384120v1      GCA_013841205.1  caucasica  224.8
Derivate   A. m. ligustica     ASM1932182v1      GCA_019321825.1  ligustica  231.1
We used the Cactus [1] pangenome workflow to generate a 5-way pangenome alignment, masking in blocks of 10 kb and considering all sequences in each assembly. Because of incompatibilities with the index generation step of the pangenome workflow, we regenerated the giraffe indexes from the output GFA/VCF file. We also downloaded the annotations available for three genomes (DH4, A. m. carnica and A. m. caucasica) and renamed the contigs to include the subspecies name (e.g. 'LG1' for A. m. caucasica became 'caucasica.LG1'). We then ran the vg autoindex function:
vg autoindex -w giraffe -o pangenome -t 8 -T ./TMP -x carnica.gff -x caucasica.gff -R XG -x DH4.gff
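
The contig renaming applied to the annotations before indexing can be sketched as follows. This is a minimal illustration, not the script used for this dataset: the function name `prefix_gff_seqids` and the example records are hypothetical, and it assumes tab-separated GFF3 input whose first column holds the contig name.

```python
def prefix_gff_seqids(lines, prefix):
    """Yield GFF lines with '<prefix>.' prepended to every sequence ID.

    Hypothetical helper: assumes tab-separated GFF3 records, where
    column 1 is the contig name, plus optional '##sequence-region' pragmas.
    """
    for line in lines:
        text = line.rstrip("\n")
        if text.startswith("##sequence-region"):
            # Pragma layout: ##sequence-region <seqid> <start> <end>
            parts = text.split()
            if len(parts) >= 2:
                parts[1] = f"{prefix}.{parts[1]}"
            yield " ".join(parts)
        elif text.startswith("#") or not text:
            # Other comments, pragmas, and blank lines pass through unchanged.
            yield text
        else:
            fields = text.split("\t")
            fields[0] = f"{prefix}.{fields[0]}"
            yield "\t".join(fields)


# Made-up records, for illustration only.
example = [
    "##gff-version 3",
    "##sequence-region LG1 1 1000",
    "LG1\texample\tgene\t100\t900\t.\t+\t.\tID=gene0",
]
renamed = list(prefix_gff_seqids(example, "caucasica"))
for row in renamed:
    print(row)
```

The same prefix must also match the contig names stored in the graph paths; otherwise the annotation records cannot be placed on the graph.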
We compared this 'full' cactus graph to one built using only the sequence on the linkage groups, and to one generated using the PGGB workflow. We evaluated the different graphs using PGGE. The full cactus graph returned the highest aligned identity, sequence matches, and unique alignments, while having the lowest number of multiple-mapping and missing alignments.
Parameter            CACTUS (LG)  CACTUS (FULL)  PGGB (norm)
HAv3.1 size (bp)     221,626,419  225,250,884    225,250,884
Graph length (bp)    243,077,200  246,675,144    315,672,860
Extra sequence (bp)  21,450,781   21,424,260     90,421,976
# nodes              17,120,383   11,664,080     17,952,597
# edges              21,385,634   15,990,593     21,702,958
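
The "Extra sequence" values appear to be the graph length minus the HAv3.1 reference size. The source does not state this definition; it is inferred from the numbers, which the short check below reproduces. The variable names are illustrative.

```python
# Values copied from the table above (bp).
graphs = {
    "CACTUS (LG)": {"ref_size": 221_626_419, "graph_length": 243_077_200},
    "CACTUS (FULL)": {"ref_size": 225_250_884, "graph_length": 246_675_144},
    "PGGB (norm)": {"ref_size": 225_250_884, "graph_length": 315_672_860},
}

# Assumed definition: extra sequence = graph length - reference size.
extra = {
    name: g["graph_length"] - g["ref_size"] for name, g in graphs.items()
}

for name, bp in extra.items():
    share = bp / graphs[name]["ref_size"]
    print(f"{name}: {bp:,} bp extra ({share:.1%} of the reference size)")
```

Under this reading, both cactus graphs add roughly 21.4 Mb beyond the reference, while the PGGB graph adds about 90.4 Mb.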
References
1. Armstrong, J. et al. Progressive Cactus is a multiple-genome aligner for the thousand-genome era. Nature 587 (2020).
Created:
2023-03-29



