
Jute genome sequencing: an Indian initiative

Source: NIAID Data Ecosystem (indexed 2026-03-12)
Download link:
https://figshare.com/articles/dataset/Jute_genome_sequencing_an_Indian_initiative/14812848
Description:
This dataset covers whole-genome shotgun (WGS) sequencing of Corchorus olitorius cv. JRO-524 (Navin), a pioneering Indian tossa jute variety. The JRO-524 draft genome (DDBJ/EMBL/GenBank: LLWS00000000.1; SRA: SRX1506532; BioProject: PRJNA278717) was assembled from Illumina MiSeq data (merged, overlapping 2 x 250 bp reads) and comprises 52,371 contigs with a total assembly size of 377,376,943 bp. The JRO-524 repeat library was constructed by searching for MITEs and LTRs, followed by de novo identification and collection of the most repetitive sequences with RepeatModeler v1.0.10. For genome annotation using RNA-Seq evidence, the genome was soft-masked for all known and unknown repeats using RepeatMasker v4.0.7 and then aligned with 454 RNA-Seq reads (DDBJ/EMBL/GenBank TSA: GFDJ00000000.1; SRA: SRR5145920) of C. olitorius cv. Sudan Green, one of the parents of JRO-524. Genes were predicted using WebAUGUSTUS, with intron hints from the RNA-Seq alignments and Theobroma cacao as the closest species to C. olitorius. The predicted genes were functionally annotated using the Blast2GO/OmicsBox pipeline (blastx, GO mapping, annotation, InterProScan and GO Slim), and the GO functional annotations and classifications were summarized using WEGO 2.0. Non-coding RNAs were identified using Infernal v1.1.2 by querying the Rfam database, and the distribution of simple sequence repeats (SSRs) was analyzed with GMATA v2.2.1. The completeness of the genome assembly and annotation was assessed by Benchmarking Universal Single-Copy Orthologs (BUSCO) scores (lineage dataset: eudicotyledons_odb10). All gene prediction and functional annotation data, together with the BUSCO results and other source files, are deposited here.
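The contig count and total assembly size quoted above are the kind of summary figures that can be recomputed from a contig FASTA file. The following is a minimal sketch in Python, not the authors' method: the function names `read_fasta_lengths` and `assembly_stats` and the toy input are hypothetical, and N50 is included only as a commonly reported companion statistic (the description does not state one). From the reported figures alone, the mean contig length works out to roughly 7.2 kb (377,376,943 bp / 52,371 contigs).

```python
from typing import Dict, Iterable, List


def read_fasta_lengths(lines: Iterable[str]) -> List[int]:
    """Return the length of every sequence found in FASTA-formatted lines."""
    lengths: List[int] = []
    current = None  # length of the record being read; None before the first header
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)  # close the previous record
            current = 0
        elif current is not None:
            current += len(line)  # sequence may span several lines
    if current is not None:
        lengths.append(current)  # close the last record
    return lengths


def assembly_stats(lengths: List[int]) -> Dict[str, float]:
    """Summarize contig lengths: count, total size, mean length and N50."""
    total = sum(lengths)
    n50 = 0
    running = 0
    # N50: length of the contig at which the cumulative size of the
    # longest-first sorted contigs first reaches half of the total size.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            n50 = length
            break
    mean = total / len(lengths) if lengths else 0.0
    return {"contigs": len(lengths), "total_bp": total, "mean_bp": mean, "n50": n50}


# Toy input standing in for a real contig file (hypothetical data).
toy_fasta = [">contig_a", "ACGT", "ACGT", ">contig_b", "ACG", ">contig_c", "AC"]
toy_stats = assembly_stats(read_fasta_lengths(toy_fasta))
```

For a real assembly, the same functions could be applied to an open file handle, for example `assembly_stats(read_fasta_lengths(open(path)))`, where `path` points to a locally downloaded contig FASTA file.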
Created:
2021-06-20