Fasting northern elephant seal pup blubber transcriptome
Source: DataCite Commons | Updated: 2020-08-31 | Indexed: 2024-07-27
Download link:
https://figshare.com/articles/Fasting_northern_elephant_seal_pup_blubber_transcriptome/5746227/1
Description:
De novo Trinity transcriptome assembly from blubber (adipose tissue) collected from weaned northern elephant seal (Mirounga angustirostris, NES) pups during their post-weaning fast. Samples were collected from two independent cohorts of 6 pups each, "early fasting" (1-2 weeks post-weaning) and "late fasting" (6-8 weeks post-weaning), and total RNA was isolated (RIN: 7.6-9.0). Twelve strand-specific libraries were prepared according to the Illumina protocol with Ribo-zero depletion (human/rat/mouse) and sequenced (125 bp paired-end reads) on one lane of an Illumina HiSeq 2000, producing an average of 41.5 million reads per sample. De novo assembly was conducted using all 12 samples and Trinity v2.1.1 with default settings, including adapter trimming, but without abundance normalization. The assembly contains 1,830,330 transcripts (contigs) in 1,635,200 gene clusters, with mean and median contig lengths of 835 bp and 425 bp, respectively. The mapping rate of reads to the assembly was 87%. The assembly contains 81.2% complete, 36.3% duplicated, and 15.5% fragmented vertebrate BUSCOs, with only 3.2% missing from this dataset. The assembly was annotated by BLASTx using DIAMOND v0.8.31 with the "very sensitive" option against the UniProt/SwissProt database (downloaded 8/20/17), with an e-value threshold of 0.001. We identified 389,363 homologs in the transcriptome. The raw assembly file is called FastingNESPupTrinityAssembly.gz and the annotation file is called FastingNESPupAssemblyAnnotation.xlsx (also available as a tab-delimited text file).
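The contig count and the mean and median contig lengths reported in the description could be re-derived from the assembly file with a short script. The following is a minimal sketch, assuming that FastingNESPupTrinityAssembly.gz is a single gzip-compressed FASTA file (Trinity writes FASTA, but the exact packaging of the download is not stated in the record). The function names are illustrative and are not part of the dataset.

```python
import gzip
from statistics import mean, median
from typing import Iterable, Iterator


def contig_lengths(lines: Iterable[str]) -> Iterator[int]:
    """Yield the sequence length of each FASTA record found in `lines`."""
    length = None  # stays None until the first '>' header is seen
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if length is not None:
                yield length  # close the previous record
            length = 0
        elif length is not None:
            length += len(line)  # sequence may wrap over several lines
    if length is not None:
        yield length  # close the final record


def length_summary(lines: Iterable[str]) -> dict:
    """Return contig count, mean length, and median length in bp."""
    lengths = list(contig_lengths(lines))
    if not lengths:
        raise ValueError("no FASTA records found")
    return {
        "contigs": len(lengths),
        "mean_bp": mean(lengths),
        "median_bp": median(lengths),
    }


def summarize_gzipped_fasta(path: str) -> dict:
    """Assumption: `path` points to a gzip-compressed FASTA file."""
    with gzip.open(path, "rt") as handle:
        return length_summary(handle)
```

If the packaging assumption holds, calling `summarize_gzipped_fasta("FastingNESPupTrinityAssembly.gz")` would be expected to give values close to the figures quoted in the description. Holding all contig lengths in memory is a deliberate simplification that keeps the median calculation exact.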
Provider:
figshare
Created:
2018-01-20



