Data from: Transcriptome sequencing of sea cucumber (Apostichopus japonicus) and the identification of gene-associated markers
Source: DataONE. Updated: 2013-06-28. Indexed: 2024-06-27.
Download link:
https://search.dataone.org/view/null
Description:
The sea cucumber (Apostichopus japonicus) is an ecologically and economically important species in East and South-East Asia. This project aimed to identify large numbers of gene-associated markers, and differentially expressed genes (DEGs) after lipopolysaccharide (LPS) challenge, in A. japonicus using high-throughput transcriptome sequencing. Deep sequencing on the Illumina HiSeq™ 2000 platform yielded 162 million high-quality reads out of 174 million raw reads. Assembly of these reads generated 94 704 unigenes, with lengths ranging from 200 to 16 153 bp (average length 810 bp). A total of 36 005 unigenes were identified as coding sequences (CDSs), 32 479 of which were successfully annotated. Based on the assembled transcriptome, we identified 142 511 high-quality single nucleotide polymorphisms (SNPs). Among them, 33 775, 63 120 and 45 616 were located in sequences without predicted CDS (non-CDSs), CDSs and untranslated regions (UTRs), respectively. These putative SNPs included 82 664 transitions and 59 847 transversions. In total, 89 375 (59.1%) were distributed in 15 473 known genes. A total of 6417 microsatellites were detected in 5970 unigenes; 3216 of these were annotated, and primers were successfully designed for 2481. The numbers of simple sequence repeats (SSRs) identified in non-CDSs, CDSs and UTRs were 2367, 2316 and 1734, respectively. These potential SNPs and SSRs are expected to provide abundant resources for genetic, evolutionary and ecological studies in sea cucumber. Transcriptome comparison revealed 1330, 1347 and 1291 DEGs in the coelomocytes of A. japonicus at 4 h, 24 h and 72 h after LPS challenge, respectively. Approximately 58.4% (1802) of the total DEGs have been successfully annotated.
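The marker counts quoted in the description can be cross-checked with a little arithmetic. The following is a minimal sketch, not part of the dataset: the variable names and the helper `partitions_total` are illustrative assumptions, and the only inputs are the figures stated in the description. It confirms that the per-region SNP and SSR counts add up to the reported totals and derives the transition/transversion ratio, which the description does not state.

```python
# Counts copied from the description; all names here are illustrative.
snp_total = 142_511
snp_by_region = {"non-CDS": 33_775, "CDS": 63_120, "UTR": 45_616}
transitions, transversions = 82_664, 59_847

ssr_total = 6_417
ssr_by_region = {"non-CDS": 2_367, "CDS": 2_316, "UTR": 1_734}


def partitions_total(parts, total):
    """Return True if the per-category counts sum exactly to the total."""
    return sum(parts.values()) == total


snp_regions_ok = partitions_total(snp_by_region, snp_total)
snp_types_ok = (transitions + transversions) == snp_total
ssr_regions_ok = partitions_total(ssr_by_region, ssr_total)

# Derived quantity (not stated in the description): transitions per transversion.
ts_tv_ratio = transitions / transversions

print(f"SNP regions sum to total: {snp_regions_ok}")
print(f"Transitions + transversions sum to total: {snp_types_ok}")
print(f"SSR regions sum to total: {ssr_regions_ok}")
print(f"Ts/Tv ratio: {ts_tv_ratio:.2f}")
```

All three partition checks hold for the figures given, and the ratio comes out at roughly 1.38 transitions per transversion.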
Created:
2013-06-28



