
Genomic insights into the chromosomal elongation in a family of Collembola

Source: DataONE. Updated: 2024-01-29. Indexed: 2024-06-08.
Download link:
https://search.dataone.org/view/sha256:68977a32e8a3da13e21675400a8d6df86b07ecfbcdb84bd191cc177df352550f
Description:
Collembola is a highly diverse and abundant group of soil arthropods with chromosome numbers ranging from 5 to 11. Previous karyotype studies indicated that the family Tomoceridae possesses an exceptionally long chromosome. To better understand chromosome size evolution in Collembola, we obtained a chromosome-level genome of Yoshiicerus persimilis with a size of 334.44 Mb and BUSCO completeness of 97.0% (n = 1,013). The genomes of both Y. persimilis and Tomocerus qinae (recently published) have an exceptionally large chromosome (ElChr, >100 Mb) that accounts for nearly one-third of the genome. Comparative genomic analyses suggest that chromosomal elongation occurred independently in the two species approximately 10 million years ago, rather than in the ancestor of Tomoceridae. The ElChr elongation was caused by large tandem and segmental duplications, as well as transposon proliferation, with genes in these regions experiencing weaker purifying selection (higher dN/dS) than conserved...

Genome assembly: De novo assembly of PacBio long reads was performed with Raven v. 1.6.0. The assembly was then polished with one round of long reads using Flye v. 2.8.3 and two rounds of Illumina short reads using NextPolish v. 1.3.1. Primary contigs were anchored into chromosomes using 3D-DNA v. 180922.

Genome annotation: We used MAKER v. 3.01.03, which integrates ab initio, RNA-seq, and protein homology evidence, to predict protein-coding genes (PCGs). BRAKER v. 2.1.6 and GeMoMa v. 1.7.1 predictions combining protein and transcriptome evidence were integrated as the ab initio input passed to MAKER. BRAKER trained Augustus v. 3.3.4 and GeneMark-ES/ET/EP 4.68_lic, integrating evidence from the OrthoDB10 v1 database. GeMoMa, run with the parameters “GeMoMa.c = 0.3 GeMoMa.p = 12”, used eight species (Daphnia magna, Cloeon dipterum, Zootermopsis nevadensis, Drosophila melanogaster, Rhopalosiphum maidis, Tribolium castaneum, Sinella curviseta, and FCSH) as the protein homology-based reference. RNA-seq alignments were p...

The sequencing reads are deposited at NCBI (SRR13480398–SRR13480401 and SRR25299242) under BioProject PRJNA630033. The genome assembly is deposited at GenBank under accession JABJWA000000000. Additionally, the annotation results for repeated sequences, gene structure, and functional prediction have been deposited in Figshare (https://doi.org/10.6084/m9.figshare.23722086).

genome.fa.masked.gz: Repeat-masked genome assembly
repeats.gff.gz: Repeat annotation
iprscan.tsv.gz: InterProScan results
eggnog.emapper.annotations.gz: eggNOG annotation results
gene.maker.gff.gz: Annotation file of MAKER-annotated protein-coding genes
cds.maker.fasta.gz: Coding sequences of MAKER-annotated protein-coding genes
proteins.maker.fasta.gz: Amino-acid sequences of MAKER-annotated protein-coding genes
transcripts.maker.fasta.gz: Transcripts of MAKER-annotated protein-coding genes
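
The description states that the elongated chromosome exceeds 100 Mb and makes up nearly one-third of the genome. As a quick sanity check on a downloaded copy of genome.fa.masked.gz, the following minimal Python sketch computes the length of each sequence in a FASTA file and the share of the total held by the longest one. It is not part of the dataset or the authors' pipeline. The function names `sequence_lengths` and `largest_sequence_share` are hypothetical, and the sketch assumes the file is ordinary (optionally gzip-compressed) FASTA in which masked bases still occupy one character each.

```python
import gzip
from typing import Dict, Tuple


def open_text(path: str):
    """Open a plain or gzip-compressed text file for reading."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def sequence_lengths(path: str) -> Dict[str, int]:
    """Return {sequence name: length in bases} for a FASTA file.

    Hypothetical helper: the name is taken as the first
    whitespace-delimited token of each header line.
    """
    lengths: Dict[str, int] = {}
    name = None
    with open_text(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                fields = line[1:].split()
                name = fields[0] if fields else ""
                lengths[name] = 0
            elif name is not None:
                # Masked bases (N or lowercase) still count toward length.
                lengths[name] += len(line)
    return lengths


def largest_sequence_share(lengths: Dict[str, int]) -> Tuple[str, int, float]:
    """Return (name, length, fraction of total) for the longest sequence."""
    total = sum(lengths.values())
    if total == 0:
        raise ValueError("no sequence data found")
    name = max(lengths, key=lengths.get)
    return name, lengths[name], lengths[name] / total
```

Calling `largest_sequence_share(sequence_lengths("genome.fa.masked.gz"))` on a local copy should report the longest sequence and its fraction of the assembly. Whether that sequence corresponds to ElChr depends on how the deposited assembly names and orders its chromosomes, which this page does not specify.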
Created:
2025-07-26