Data from: First insights into the giant panda (Ailuropoda melanoleuca) blood transcriptome: a resource for novel gene loci and immunogenetics
Source: DataONE. Updated: 2014-12-30. Indexed: 2024-06-27.
Download link:
https://search.dataone.org/view/null
Resource description:
The giant panda (Ailuropoda melanoleuca) is one of the most famous flagship species for conservation, and its draft genome has recently been assembled. However, its transcriptome is not yet available. In this study, the blood transcriptomes of three pandas were characterized, and about 160 million sequencing reads were generated using Illumina HiSeq 2000 paired-end sequencing. The assembly yielded 92,598 transcripts with an average length of 1626 bp and an N50 length of 2842 bp. Based on a sequence similarity search against the non-redundant (nr) protein database, a total of 38,522 (41.6%) transcripts were annotated. Of these annotated transcripts, 25,142 and 8272 were assigned to Gene Ontology (GO) terms and Clusters of Orthologous Groups (COG), respectively. A search against the Kyoto Encyclopedia of Genes and Genomes (KEGG) Pathway database indicated that 9098 (9.83%) transcripts mapped to 324 KEGG pathways, and the best-represented functional categories of pathways were signal transduction and immune system. We also identified 23,460 microsatellites, 43,560 SNPs, and 21,456 alternative splicing events in the assembly. Additionally, a total of 24,341 complete open reading frames (ORFs) were detected in the assembly, of which 1492 were identified as novel gene loci because they have not yet been annotated in any public database.
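For readers unfamiliar with the assembly statistics quoted above, the following is a minimal sketch of how an N50 length is conventionally computed and how the reported percentages follow from the transcript counts. The function names `n50` and `percent` and the toy length list are illustrative assumptions, not part of the dataset or the authors' pipeline; only the counts 92,598, 38,522 and 9098 come from the description.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L
    together cover at least half of the total assembled bases.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


def percent(part, whole, digits):
    """Share of `part` in `whole`, as a percentage rounded to `digits`."""
    return round(100 * part / whole, digits)


# Toy lengths (hypothetical, not taken from the panda assembly).
toy_lengths = [2, 3, 4, 5, 6]
print(n50(toy_lengths))          # 5: lengths 6 + 5 = 11 cover half of 20

# Counts taken from the description above.
total_transcripts = 92598
print(percent(38522, total_transcripts, 1))  # 41.6 (nr-annotated)
print(percent(9098, total_transcripts, 2))   # 9.83 (mapped to KEGG)
```

Applied to the real transcript lengths, the same `n50` function would be expected to return the reported 2842 bp, assuming the authors used this standard definition.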
Created:
2014-12-30



