Jute genome sequencing: an Indian initiative
Source: DataCite Commons. Updated: 2025-06-01. Indexed: 2024-08-18.
Download link:
https://figshare.com/articles/dataset/Jute_genome_sequencing_an_Indian_initiative/14812848/1
Description:
This dataset pertains to whole-genome shotgun (WGS) sequencing of Corchorus olitorius cv. JRO-524 (Navin), a pioneering Indian tossa jute variety. The JRO-524 draft genome (DDBJ/EMBL/GenBank: LLWS00000000.1; SRA: SRX1506532; BioProject: PRJNA278717) was assembled from Illumina MiSeq reads (merged 2 x 250 bp overlapping reads) and comprises 52,371 contigs with a total assembly size of 377,376,943 bp. The JRO-524 repeat library was constructed by searching for MITEs and LTRs, followed by de novo identification and collection of the most repetitive sequences with RepeatModeler v1.0.10. For genome annotation using RNA-Seq evidence, the genome was soft-masked for all known and unknown repeats using RepeatMasker v4.0.7 and then aligned with 454 RNA-Seq reads (DDBJ/EMBL/GenBank TSA: GFDJ00000000.1; SRA: SRR5145920) of C. olitorius cv. Sudan Green, one of the parents of JRO-524. Genes were predicted using WebAUGUSTUS with intron hints from the RNA-Seq alignments and with Theobroma cacao as the species closest to C. olitorius. The predicted genes were functionally annotated using the pipeline (blastx, GO mapping, annotation, InterProScan and GO Slim) implemented in Blast2GO/OmicsBox, and the GO functional annotations and classifications were summarized using WEGO 2.0. Non-coding RNAs were identified using Infernal v1.1.2 by querying the Rfam database, while the distribution of simple sequence repeats (SSRs) was analyzed with GMATA v2.2.1. Genome assembly and annotation completeness was assessed by Benchmarking Universal Single-Copy Orthologs (BUSCO) scores (lineage dataset: eudicotyledons_odb10). All gene prediction and functional annotation data, together with BUSCO results and other source files, are deposited here.
Provider:
figshare
Created:
2021-06-20
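
The description above reports the assembly as a contig count and a total size, and notes that the genome was soft-masked before annotation. As a minimal sketch of how such summary figures could be checked against a downloaded FASTA file, the Python below counts contigs, sums their lengths, and reports the fraction of lowercase bases. It assumes the soft-masked sequence uses the common convention of lowercase letters for repeats. The function names `read_fasta` and `assembly_summary` are hypothetical and are not part of the dataset, and N50 is included only as a customary extra statistic that the description does not report.

```python
from typing import Dict, Iterable, Iterator, List, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (record_id, sequence) pairs from FASTA-formatted lines."""
    name = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                yield name, "".join(chunks)
            parts = line[1:].split()
            name = parts[0] if parts else ""
            chunks = []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def assembly_summary(lines: Iterable[str]) -> Dict[str, float]:
    """Summarize contig count, total size, N50 and soft-masked fraction."""
    lengths: List[int] = []
    soft_masked = 0
    for _, seq in read_fasta(lines):
        lengths.append(len(seq))
        # Assumption: soft-masked repeats are written in lowercase.
        soft_masked += sum(1 for base in seq if base.islower())

    total = sum(lengths)

    # N50: length of the contig at which the running sum of lengths,
    # taken from longest to shortest, first reaches half the total.
    n50 = 0
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            n50 = length
            break

    return {
        "contigs": len(lengths),
        "total_bp": total,
        "n50": n50,
        "soft_masked_fraction": soft_masked / total if total else 0.0,
    }


if __name__ == "__main__":
    # Toy input for illustration only; not data from the JRO-524 assembly.
    toy = [">c1", "ACGTacgt", "AC", ">c2", "acgt", ">c3", "GG"]
    print(assembly_summary(toy))
```

To run this on a real file, pass an open file handle instead of the toy list, for example `assembly_summary(open("assembly.fasta"))`, where `assembly.fasta` is a placeholder path rather than a file name taken from the dataset.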



