Jute genome sequencing: an Indian initiative
Source: NIAID Data Ecosystem (indexed 2026-03-12)
Download link:
https://figshare.com/articles/dataset/Jute_genome_sequencing_an_Indian_initiative/14812848
Description:
This dataset pertains to whole-genome shotgun (WGS) sequencing of Corchorus olitorius cv. JRO-524 (Navin), a pioneering Indian tossa jute variety. The JRO-524 draft genome (DDBJ/EMBL/GenBank: LLWS00000000.1; SRA: SRX1506532; BioProject: PRJNA278717) was assembled from Illumina MiSeq data (merged 2 x 250 bp overlapping sequence reads) and comprises 52,371 contigs with a total assembly size of 377,376,943 bp. The JRO-524 repeat library was constructed by searching for MITEs and LTRs, followed by de novo identification and collection of the most repetitive sequences with RepeatModeler v1.0.10. For genome annotation using RNA-Seq evidence, the genome was soft-masked for all known and unknown repeats using RepeatMasker v4.0.7 and then aligned with 454 RNA-Seq reads (DDBJ/EMBL/GenBank TSA: GFDJ00000000.1; SRA: SRR5145920) of C. olitorius cv. Sudan Green, one of the parents of JRO-524. Genes were predicted using WebAUGUSTUS with intron hints from the RNA-Seq alignments and with Theobroma cacao as the closest species to C. olitorius. The predicted genes were functionally annotated using the pipeline (blastx, GO mapping, annotation, InterProScan and GO Slim) implemented in Blast2GO/OmicsBox, and the GO functional annotations and classifications were summarized using WEGO 2.0. Non-coding RNAs were identified using Infernal v1.1.2 by querying the Rfam database, while the distribution of simple sequence repeats (SSRs) was analyzed with GMATA v2.2.1. Genome assembly and annotation completeness were assessed by Benchmarking Universal Single-Copy Orthologs (BUSCO) scores (lineage dataset: eudicotyledons_odb10). All gene prediction and functional annotation data, together with the BUSCO results and other source files, are deposited here.
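The description mentions a soft-masked genome and gives headline assembly figures. As a rough illustration only, the Python sketch below shows how one might inspect such a file: soft-masking conventionally marks repeats by writing them in lowercase, so the masked fraction of a contig is the share of lowercase bases. The function names and the toy records are hypothetical and are not taken from the deposited files; only the contig count and assembly size come from the description above.

```python
from typing import Iterable, Iterator, Tuple


def parse_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (record_id, sequence) pairs from FASTA-formatted lines."""
    name = None
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            header = line[1:].split()
            # Use the first whitespace-delimited token as the record id.
            name = header[0] if header else ""
            chunks = []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def soft_masked_fraction(seq: str) -> float:
    """Fraction of bases written in lowercase (the soft-mask convention)."""
    if not seq:
        return 0.0
    return sum(1 for base in seq if base.islower()) / len(seq)


def mean_contig_length(total_bp: int, n_contigs: int) -> float:
    """Average contig length in bp."""
    return total_bp / n_contigs


if __name__ == "__main__":
    # Hypothetical toy records, not real JRO-524 contigs.
    toy = [">contig_1 demo", "ACGTacgtNN", "acgt", ">contig_2", "ACGT"]
    for record_id, seq in parse_fasta(toy):
        print(record_id, len(seq), f"{soft_masked_fraction(seq):.2f}")

    # Figures reported in the description: 377,376,943 bp in 52,371 contigs.
    print(f"mean contig length: {mean_contig_length(377_376_943, 52_371):.0f} bp")
```

With the reported figures, the mean contig length works out to roughly 7.2 kb. To run the same check on a real file, pass an open file handle to `parse_fasta` in place of the toy list.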
Created:
2021-06-20



