Complete T2T assembly of the Gila monster (Heloderma suspectum)
Source: Figshare | Updated: 2025-12-04 | Indexed: 2026-04-28
Download link:
https://figshare.com/articles/dataset/Complete_T2T_assembly_of_the_Gila_monster_i_Heloderma_suspectum_i_/30161734
Description:
Final genome assembly files associated with the first public, complete T2T squamate genome, the Gila monster (Heloderma suspectum).

Heloderma_suspectum.final.reference.fasta -- Final reference assembly containing the longest autosomes between both haplotypes, the Z and W chromosomes, and the mitogenome (i.e. hap1 + chrW + mtDNA). The pseudoautosomal region (PAR) on chrW is soft masked (lowercase). Useful for studies conducting whole-genome alignments, synteny analysis, etc. This is the version submitted to GenBank.

Heloderma_suspectum.final.reference-chrW_PAR-masked.fasta -- Final reference assembly containing the longest autosomes between both haplotypes, the Z and W chromosomes, and the mitogenome (i.e. hap1 + chrW + mtDNA). The pseudoautosomal region (PAR) on chrW is hard masked (replaced by "N" characters). Useful for studies that map reads with the intention of calling accurate variants (e.g. indels, SNPs, etc.) on the sex chromosomes in female (ZW) samples.

Heloderma_suspectum.final.reference.gff3 -- Annotation of gene features for the final reference assembly, generated by EviAnn (https://github.com/alekseyzimin/EviAnn_release) with RNA-seq, Iso-seq, and a custom protein database as evidence. Satellite arrays were hard-masked on chrW prior to gene annotation. Mitogenome annotations were generated via liftoff [v1.6.3] (https://github.com/agshumate/Liftoff) from the reference mitogenome for Heloderma suspectum (NC_008776.1).

Heloderma_suspectum.final.reference.assembly-features.bed -- BED file specifying the regions of manual genome patching and the location of the single remaining gap on chrW.

Heloderma_suspectum.final.reference.telo.bed -- BED file specifying the locations of terminal telomeres for each chromosome.

Heloderma_suspectum.final.reference.PAR.bed -- BED file specifying the manually curated locations of the pseudoautosomal regions on both the Z and W chromosomes.

Heloderma_suspectum.final.reference.longdust.bed -- BED file specifying Low Complexity Regions (LCRs) in the genome assembly, annotated using longdust [v1.4-r97] (https://github.com/lh3/longdust).

Heloderma_suspectum.final.reference.aniann.bed -- BED file specifying the locations of satellite arrays, generated using anianns (https://github.com/marbl/anianns). Also used to mask chrW prior to gene annotation, i.e. there are currently no gene annotations within these arrays on chrW.

Heloderma_suspectum.final.reference.*.longcallD.vcf -- VCF files specifying the locations of haplotype-specific small variants and structural variants (SVs) called from both ONT and HiFi data using longcallD [v0.0.4] (https://github.com/yangao07/longcallD).

Heloderma_suspectum_10-3-25.hap1.HiFi-polished.fasta -- FASTA file containing the complete T2T haplotype assembly for the Gila monster (contains chrZ). Associated manual curation features are provided in Heloderma_suspectum_10-3-25.hap1.HiFi-polished.features.bed, repeat elements annotated using EarlGrey in Heloderma_suspectum_10-3-25.hap1.filteredRepeats.gff3, and LCRs in Heloderma_suspectum_10-3-25.hap1.HiFi-polished.longdust.bed (see longdust.bed above).

Heloderma_suspectum_10-3-25.hap2.HiFi-polished.fasta -- FASTA file containing the near-complete T2T haplotype assembly for the Gila monster (contains chrW). Associated manual curation features are provided in Heloderma_suspectum_10-3-25.hap2.HiFi-polished.features.bed, repeat elements annotated with RepeatMasker using the hap1 EarlGrey library (Heloderma_suspectum_10-3-25.hap1-families.fa.strained.clstrd.fasta.gz) in Heloderma_suspectum_10-3-25.hap2.filteredRepeats.gff3, and LCRs in Heloderma_suspectum_10-3-25.hap2.HiFi-polished.longdust.bed (see longdust.bed above).

The initial assembly graph was generated from UL-ONT data only with hifiasm [v0.25.0-r726] (https://github.com/chhylp123/hifiasm) and patched using a separate assembly graph generated from PacBio HiFi and UL-ONT data with verkko [v2.2.1] (https://github.com/marbl/verkko). The assembly was polished using PacBio data aligned to the diploid assembly using minimap2 [v2.28-r1209] (https://github.com/lh3/minimap2) and freebayes [v1.3.8] (https://github.com/freebayes/freebayes).
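
The two reference FASTA files described above differ only in how the chrW PAR is masked: lowercase bases (soft masking) in one, "N" characters (hard masking) in the other. The following is a minimal Python sketch, not part of the dataset, for checking which kind of masking a sequence carries. The function names `read_fasta` and `mask_summary` and the toy records are hypothetical, and the sequence names in the real files may differ from the `chrW`/`chrZ` labels used here.

```python
from typing import Dict, Iterable, Iterator, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (name, sequence) pairs from an iterable of FASTA lines."""
    name = None
    chunks = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Keep only the first whitespace-delimited token of the header.
            parts = line[1:].split()
            name = parts[0] if parts else ""
            chunks = []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def mask_summary(seq: str) -> Dict[str, int]:
    """Count soft-masked (lowercase), hard-masked (N/n) and unmasked bases.

    Note: N runs can also represent assembly gaps, so the "hard" count
    is not necessarily limited to deliberately masked regions.
    """
    hard = sum(1 for base in seq if base in "Nn")
    soft = sum(1 for base in seq if base.islower() and base != "n")
    return {"soft": soft, "hard": hard, "unmasked": len(seq) - soft - hard}


# Toy records standing in for a real assembly (hypothetical data).
toy_lines = [">chrW example", "ACGTacgt", "NNNN", ">chrZ", "ACGTACGT"]
for seq_name, sequence in read_fasta(toy_lines):
    print(seq_name, mask_summary(sequence))
```

For a real file, an open text handle can be passed in place of the toy list, e.g. `with open(path) as handle: ...` followed by `read_fasta(handle)`. Because each sequence is joined into a single in-memory string, a streaming per-line count may be preferable for chromosome-scale records.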
Created:
2025-12-04



