
Sequencing and Analysis of Full-Length cDNAs, 5′-ESTs and 3′-ESTs from a Cartilaginous Fish, the Elephant Shark (Callorhinchus milii)

Source: Figshare. Updated: 2016-01-19. Indexed: 2026-04-29.
Download link:
https://figshare.com/articles/dataset/Sequencing_and_Analysis_of_Full_Length_cDNAs_5_ESTs_and_3_ESTs_from_a_Cartilaginous_Fish_the_Elephant_Shark_Callorhinchus_milii_/118779
Description:
Cartilaginous fishes are the most ancient group of living jawed vertebrates (gnathostomes) and are, therefore, an important reference group for understanding the evolution of vertebrates. The elephant shark (Callorhinchus milii), a holocephalan cartilaginous fish, has been identified as a model cartilaginous fish genome because of its compact genome (∼910 Mb), and a genome project has been initiated to obtain its whole genome sequence. In this study, we generated and sequenced full-length enriched cDNA libraries of the elephant shark using the ‘oligo-capping’ method and Sanger sequencing. A total of 6,778 full-length protein-coding cDNAs and 10,701 full-length noncoding cDNAs were sequenced from six tissues (gills, intestine, kidney, liver, spleen, and testis) of the elephant shark. Analysis of their polyadenylation signals showed that polyadenylation usage in elephant shark is similar to that in mammals. Furthermore, coding and noncoding transcripts of the elephant shark use the same proportion of canonical polyadenylation sites. In addition to BLASTX searches, protein-coding transcripts were annotated by Gene Ontology, InterPro domain, and KEGG pathway analyses. By comparing elephant shark genes to bony vertebrate genes, we identified several ancient genes that are present in elephant shark but have been differentially lost in tetrapods or teleosts. Only ∼6% of elephant shark noncoding cDNAs showed similarity to known noncoding RNAs (ncRNAs); the rest are either highly divergent ncRNAs or novel ncRNAs. In addition to full-length transcripts, 30,375 5′-ESTs and 41,317 3′-ESTs were sequenced and annotated. The clones and transcripts generated in this study are valuable resources for annotating transcription start sites, exon-intron boundaries, and UTRs of genes in the elephant shark genome, and for the functional characterization of protein sequences. These resources will also be useful for annotating genes in other cartilaginous fishes whose genomes have been targeted for whole genome sequencing.
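As an illustration of the polyadenylation-signal comparison mentioned above, the following is a minimal Python sketch of how one might estimate the fraction of transcripts carrying a canonical signal near their 3′ end. It is not the authors' pipeline. The hexamer set (AATAAA and ATTAAA, the two signals commonly treated as canonical in mammals), the 40-nt search window, the minimum poly(A) tail length, and all function names are assumptions chosen for the example; the description does not state the parameters actually used.

```python
import re

# Assumption: the two hexamers commonly treated as canonical in mammals.
CANONICAL_SIGNALS = ("AATAAA", "ATTAAA")


def strip_poly_a(seq, min_tail=10):
    """Remove a trailing run of at least min_tail A's, if present."""
    seq = seq.upper()
    match = re.search(r"A{%d,}$" % min_tail, seq)
    return seq[:match.start()] if match else seq


def find_signal(seq, window=40):
    """Return (hexamer, distance) for the canonical signal closest to the
    3' end of the transcript body, or None if no signal is in the window.

    distance is the number of bases between the end of the hexamer and
    the end of the body (i.e. the start of the poly(A) tail).
    """
    region = strip_poly_a(seq)[-window:]
    best = None
    for hexamer in CANONICAL_SIGNALS:
        pos = region.rfind(hexamer)
        if pos != -1:
            distance = len(region) - (pos + len(hexamer))
            if best is None or distance < best[1]:
                best = (hexamer, distance)
    return best


def canonical_fraction(seqs):
    """Fraction of sequences with a canonical signal in the search window."""
    if not seqs:
        return 0.0
    hits = sum(1 for s in seqs if find_signal(s) is not None)
    return hits / len(seqs)
```

Running `canonical_fraction` separately on the coding and noncoding cDNA sets would give two proportions that can be compared, which is the kind of comparison the description reports. A real analysis would likely also tally non-canonical variants and handle sequences whose poly(A) tail was trimmed during processing.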

Created:
2016-01-19