Assembly and validation of conserved long non-coding RNAs in the ruminant transcriptome

Source: DataCite Commons. Updated: 2023-04-27. Indexed: 2025-04-17.
Download link:
https://datashare.ed.ac.uk/handle/10283/2995
Description:
mRNA-like long non-coding RNAs (lncRNAs) are a significant component of mammalian transcriptomes, although most are expressed only at low levels, with high tissue-specificity and/or at specific developmental stages. This dataset demonstrates that few lncRNAs are fully captured by biological replicates of the same RNA-seq library. In a transcriptional atlas of the domestic sheep (https://doi.org/10.1371/journal.pgen.1006997), 31 diverse tissues/cell types were sampled in each of 6 individual adults (3 females, 3 males, all unrelated virgin animals approximately 2 years of age). By taking a subset of 31 common tissues per individual, each of the 6 adults (f1, f2, f3, m1, m2, and m3) was represented by ~0.75 billion reads. In a typical lncRNA assembly pipeline, read alignments from all individuals are merged to maximise the number of candidate gene models (using, for instance, StringTie --merge). With n = 6 adults (and ~0.75 billion reads per adult), there are (2^n)-1 = 63 possible non-empty combinations of individuals for which GTFs can be made with StringTie --merge. This dataset comprises those 63 GTFs.
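
The count of (2^n)-1 = 63 can be checked by enumerating every non-empty combination of the six adults. The sketch below is illustrative only and is not the authors' pipeline: the per-individual file names (f1.gtf and so on), the merged_*.gtf output names, and the helper merge_command are hypothetical. It builds command lines without running them.

```python
from itertools import combinations

# Individual labels as given in the dataset description.
ADULTS = ["f1", "f2", "f3", "m1", "m2", "m3"]


def non_empty_subsets(labels):
    """Return every non-empty combination of labels, smallest subsets first."""
    return [
        combo
        for size in range(1, len(labels) + 1)
        for combo in combinations(labels, size)
    ]


def merge_command(combo):
    """Build (but do not run) a StringTie merge command for one combination.

    Assumption: each individual has a GTF named '<label>.gtf'. These input
    names and the 'merged_*.gtf' output name are hypothetical.
    """
    output = "merged_" + "_".join(combo) + ".gtf"
    inputs = [label + ".gtf" for label in combo]
    return ["stringtie", "--merge", "-o", output] + inputs


subsets = non_empty_subsets(ADULTS)
print(len(subsets))  # number of combinations; equals 2**6 - 1
print(" ".join(merge_command(subsets[-1])))  # command for all six adults
```

The empty set is excluded because range starts at size 1, which is why the total is 2^6 minus one rather than 2^6.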

Provided by:
Roslin Institute, University of Edinburgh
Created:
2018-01-08