Gomphocerus sibiricus gene annotation models.

Figshare · Updated 2025-05-26 · Indexed 2026-04-28
Download link:
https://figshare.com/articles/dataset/_i_Gomphocerus_sbiricus_i_gene_annotation_models_/29148779
Description:
The repeat-masked genome was used for gene model annotation with BRAKER3 v3.0.8 (Gabriel et al. 2024). To support gene prediction, we used the paired-end RNA-seq reads previously generated for the club-legged grasshopper and deposited in NCBI under BioProjects PRJNA525981 and PRJNA1241690, together with Arthropoda protein data from OrthoDB v11 (Kuznetsov et al. 2023), as recommended on the BRAKER3 GitHub page (https://github.com/Gaius-Augustus/BRAKER.git). The published RNA-seq samples (Shah et al. 2019) in BioProject PRJNA525981 comprised a pool of five individuals: one imago brown female, one imago green female, one imago brown male, one imago green male and one last-instar green female (accession numbers SRX5491242 and SRX5491243). The RNA-seq samples in BioProject PRJNA1241690 comprised 11 adult males (accession number XX-XX) and 11 adult females (accession number XX-XX), with RNA extracted specifically from pronotum tissue. We aligned each paired-end RNA-seq dataset to the repeat-masked genome using HISAT2 v2.2 (Kim et al. 2015) with the “--dta” option and otherwise default settings. We then sorted the resulting BAM files with SAMtools v1.14 and used them alongside the Arthropoda protein database to run BRAKER3. To complement BRAKER3’s predictions, we ran Helixer v0.3.5 (Holst et al. 2023) with the flags “--lineage invertebrate --subsequence-length 213840 --overlap-offset 106920 --overlap-core-length 160380 --peak-threshold 0.9 --batch-size 16”. Finally, we combined the annotation files from BRAKER3 and Helixer using the agat_sp_complement_annotations.pl script from the AGAT v1.3.2 toolkit (Dainat et al. 2024).
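The workflow described above could be sketched roughly as the following sequence of command lines, assembled here in Python without being executed. This is only an illustration under stated assumptions, not the authors' actual scripts: all file names (genome.masked.fa, the read files, the index prefix, the output paths) are hypothetical, and the choice of the BRAKER3 output as the AGAT reference with Helixer as the added annotation is a guess based on the phrase "to complement BRAKER3's predictions". The Helixer values are taken from the description; note that the reported overlap offset and overlap core length equal one half and three quarters of the subsequence length.

```python
import shlex


def hisat2_cmd(index, reads_1, reads_2, sam_out):
    # Paired-end alignment with --dta (output tailored for downstream assemblers).
    return ["hisat2", "--dta", "-x", index, "-1", reads_1, "-2", reads_2, "-S", sam_out]


def samtools_sort_cmd(sam_in, bam_out):
    # Coordinate-sort the alignments and write BAM.
    return ["samtools", "sort", "-o", bam_out, sam_in]


def braker3_cmd(genome, bams, proteins):
    # BRAKER3 mode: RNA-seq alignments plus a protein database.
    return ["braker.pl", f"--genome={genome}", f"--bam={','.join(bams)}", f"--prot_seq={proteins}"]


def helixer_cmd(genome, gff_out, subseq_len=213840):
    # Flags as reported; offset and core length are derived from subseq_len.
    return [
        "Helixer.py", "--fasta-path", genome, "--gff-output-path", gff_out,
        "--lineage", "invertebrate",
        "--subsequence-length", str(subseq_len),
        "--overlap-offset", str(subseq_len // 2),
        "--overlap-core-length", str(subseq_len * 3 // 4),
        "--peak-threshold", "0.9", "--batch-size", "16",
    ]


def agat_complement_cmd(ref_gff, add_gff, out_gff):
    # Assumption: BRAKER3 is the reference, Helixer fills in non-overlapping loci.
    return ["agat_sp_complement_annotations.pl", "--ref", ref_gff, "--add", add_gff, "--out", out_gff]


if __name__ == "__main__":
    genome = "genome.masked.fa"  # hypothetical path
    steps = [
        hisat2_cmd("genome_index", "sample_1.fq.gz", "sample_2.fq.gz", "sample.sam"),
        samtools_sort_cmd("sample.sam", "sample.sorted.bam"),
        braker3_cmd(genome, ["sample.sorted.bam"], "Arthropoda.fa"),
        helixer_cmd(genome, "helixer.gff3"),
        agat_complement_cmd("braker/braker.gtf", "helixer.gff3", "combined.gff3"),
    ]
    for step in steps:
        print(shlex.join(step))
```

Running the script only prints the commands, so they can be reviewed and adapted before being executed in an environment where the tools are installed.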
Created:
2025-05-26