
Dryococelus australis genome assembly supplementary data

Source: NIAID Data Ecosystem (indexed 2026-03-14)
Download link:
http://datadryad.org/dataset/doi%253A10.5061%252Fdryad.905qfttqc
Description:
We present a chromosome-scale genome assembly for the critically endangered Lord Howe Island stick insect Dryococelus australis. This repository contains the original unfiltered annotation .gff file, the .fasta file of repeat families identified by RepeatScout, and the .gff file of repetitive elements throughout the genome assembly.

Methods: Repeat families were identified de novo and classified using the software package RepeatModeler v2.0.1. We further filtered repeats by a BLAST search against the `nr` database and removed any unclassified family whose best hit was a known protein not originating from a transposable element, including mitochondrial proteins. These repeat families were then annotated in the genome assembly using RepeatMasker v2.0.1.

Coding sequences from Clitarchus hookeri, Medauroidea extradentata, and Timema cristinae were used to train the ab initio model for D. australis using both AUGUSTUS v2.5.5 and SNAP version 2006-07-28. We extracted RNA from tissue samples (ventral ganglion, leg muscle, gut lumen, Malpighian tubules, and gonads) of both sexes with RNeasy spin-column kits (Qiagen); libraries were prepared and sequenced at BGI Hong Kong on a BGISEQ-500 instrument in paired-end mode with 150 bp reads. Reads were mapped onto the assembly using STAR v2.7, and intron hints were generated with the bam2hints tool distributed with AUGUSTUS.

SNAP and AUGUSTUS (with intron-exon boundary hints provided from RNA-Seq) were then used to predict genes in the repeat-masked assembly. Only gene models that were predicted by both SNAP and AUGUSTUS were retained. Genes were further characterised for their putative function by performing a BLAST search of the peptide sequences against a set of protein sequences from UniProt.

All raw sequence data have been deposited into NCBI's SRA database under BioProject PRJNA930028.
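The description says that only gene models predicted by both SNAP and AUGUSTUS were retained, but it does not state how agreement between the two predictors was decided. The following is a minimal Python sketch of one plausible rule, not the authors' procedure. It assumes that both prediction sets are available as GFF text with `gene` features, and that two models agree when they lie on the same sequence and strand and overlap by at least half of the shorter model. The names `Gene`, `parse_genes`, `overlap_fraction`, `consensus_genes`, and the `min_overlap` threshold are all hypothetical.

```python
from collections import defaultdict
from typing import NamedTuple


class Gene(NamedTuple):
    seqid: str
    start: int  # 1-based, inclusive (GFF convention)
    end: int    # 1-based, inclusive
    strand: str


def parse_genes(gff_text: str) -> list[Gene]:
    """Collect 'gene' features from tab-separated GFF text."""
    genes = []
    for line in gff_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, comments, and directives
        cols = line.split("\t")
        if len(cols) < 9 or cols[2] != "gene":
            continue  # keep only well-formed gene-level records
        genes.append(Gene(cols[0], int(cols[3]), int(cols[4]), cols[6]))
    return genes


def overlap_fraction(a: Gene, b: Gene) -> float:
    """Overlap length divided by the length of the shorter gene."""
    shared = min(a.end, b.end) - max(a.start, b.start) + 1
    if shared <= 0:
        return 0.0
    shorter = min(a.end - a.start + 1, b.end - b.start + 1)
    return shared / shorter


def consensus_genes(primary: list[Gene], secondary: list[Gene],
                    min_overlap: float = 0.5) -> list[Gene]:
    """Keep primary genes supported by a secondary gene.

    Assumed rule: same sequence, same strand, and overlap covering at
    least `min_overlap` of the shorter of the two models.
    """
    by_key = defaultdict(list)
    for gene in secondary:
        by_key[(gene.seqid, gene.strand)].append(gene)
    return [
        gene for gene in primary
        if any(overlap_fraction(gene, other) >= min_overlap
               for other in by_key[(gene.seqid, gene.strand)])
    ]
```

A call such as `consensus_genes(parse_genes(augustus_gff), parse_genes(snap_gff))` would return the AUGUSTUS models that have SNAP support under this assumed rule. The linear scan per sequence and strand is adequate for a sketch. An interval tree or a sorted sweep would scale better on a full genome annotation.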
Created:
2023-02-22