Data from: De novo assembly and characterization of four anthozoan (phylum Cnidaria) transcriptomes
Source: DataONE. Updated: 2015-09-21. Indexed: 2024-06-27.
Download link:
https://search.dataone.org/view/null
Description:
Many nonmodel species exemplify important biological questions but lack the sequence resources required to study the genes and genomic regions underlying traits of interest. Reef-building corals are famously sensitive to rising seawater temperatures, motivating ongoing research into their stress responses and long-term prospects in a changing climate. A comprehensive understanding of these processes will require extending beyond the sequenced coral genome (Acropora digitifera) to encompass diverse coral species and related anthozoans. Toward that end, we have assembled and annotated reference transcriptomes to develop catalogs of gene sequences for three scleractinian corals (Fungia scutaria, Montastraea cavernosa, Seriatopora hystrix) and a temperate anemone (Anthopleura elegantissima). High-throughput sequencing of cDNA libraries produced ∼20-30 million reads per sample, and de novo assembly of these reads produced ∼75,000-110,000 transcripts from each sample with size distributions (mean ∼1.4 kb, N50 ∼2 kb), comparable to the distribution of gene models from the coral genome (mean ∼1.7 kb, N50 ∼2.2 kb). Each assembly includes matches for more than half the gene models from A. digitifera (54-67%) and many reasonably complete transcripts (∼5300-6700) spanning nearly the entire gene (ortholog hit ratios ≥0.75). The catalogs of gene sequences developed in this study made it possible to identify hundreds to thousands of orthologs across diverse scleractinian species and related taxa. We used these sequences for phylogenetic inference, recovering known relationships and demonstrating superior performance over phylogenetic trees constructed using single mitochondrial loci. The resources developed in this study provide gene sequences and genetic markers for several anthozoan species. To enhance the utility of these resources for the research community, we developed searchable databases enabling researchers to rapidly recover sequences for genes of interest. 
Our analysis of de novo assembly quality highlights metrics that we expect will be useful for evaluating the relative quality of other de novo transcriptome assemblies. The identification of orthologous sequences and phylogenetic reconstruction demonstrates the feasibility of these methods for clarifying the substantial uncertainties in the existing scleractinian phylogeny.
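The description reports assembly quality as mean transcript length, N50, and ortholog hit ratio. As a minimal sketch of how such metrics are typically computed (not the authors' pipeline), the Python below works from a list of transcript lengths. The function names are hypothetical, and the ortholog hit ratio follows the commonly used definition (aligned length of the best hit divided by the full length of the ortholog), which is an assumption here because the description does not spell it out.

```python
def mean_length(lengths):
    """Arithmetic mean of transcript lengths (in bp)."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    return sum(lengths) / len(lengths)


def n50(lengths):
    """N50: the length L such that transcripts of length >= L
    together contain at least half of all assembled bases."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    # Walk from the longest transcript down, accumulating bases.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


def ortholog_hit_ratio(aligned_length, ortholog_length):
    """Assumed definition: fraction of the ortholog covered by the
    aligned region of the transcript (1.0 = full-length match)."""
    if ortholog_length <= 0:
        raise ValueError("ortholog_length must be positive")
    return aligned_length / ortholog_length


if __name__ == "__main__":
    toy_lengths = [2, 2, 2, 3, 3, 4, 8, 8]  # toy values, not real data
    print("mean:", mean_length(toy_lengths))
    print("N50:", n50(toy_lengths))
    print("hit ratio:", ortholog_hit_ratio(750, 1000))
```

Under this definition, a transcript with a hit ratio of at least 0.75 would count toward the "reasonably complete" transcripts mentioned in the description.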
Created: 2015-09-21



