
Fragmentary Sequences for Variable-Sized RNAsim Datasets

DataCite Commons | Updated 2021-06-21 | Indexed 2025-04-16
Download link:
https://databank.illinois.edu/datasets/IDB-8788479
Description:
Thank you for using these datasets. These aligned fragmentary RNAsim sequences were generated from the query sequences selected by Balaban et al. (2019) in their variable-size datasets (https://doi.org/10.5061/dryad.78nf7dq). They were created for phylogenetic placement with the multiple sequence alignments and backbone trees provided by Balaban et al. (2019). The file structure included here also corresponds to the data Balaban et al. (2020) provided. It includes:
Directories for five backbone tree sizes, named 5000, 10000, 50000, 100000, and 200000. These directory names are also used by Balaban et al. (2019) and indicate the size of the backbone tree included in their data.
Subdirectories for each replicate of a backbone tree size, labelled 0 through 4. The four smaller backbone tree sizes have five replicates each, and the largest has one replicate.
200 text files in each replicate, each containing one aligned query sequence fragment in FASTA format.
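The layout described above can be walked with a short script. The following is a minimal Python sketch, not part of the dataset itself. It assumes the archive has been extracted to a local root directory, that the single replicate of the 200000 backbone is labelled 0, and that every regular file inside a replicate directory is a FASTA file. The function names (replicate_ids, read_fasta, iter_fragments, expected_total_files) are hypothetical and chosen only for illustration; the actual file names inside each replicate are not stated in the description, so the script does not rely on them.

```python
from pathlib import Path

# Backbone tree sizes and per-replicate file count as stated in the description.
BACKBONE_SIZES = [5000, 10000, 50000, 100000, 200000]
FILES_PER_REPLICATE = 200


def replicate_ids(size):
    """Return replicate labels: 0-4 for the four smaller sizes, one for the largest."""
    # Assumption: the single replicate of the largest backbone is labelled 0.
    return [0] if size == max(BACKBONE_SIZES) else [0, 1, 2, 3, 4]


def read_fasta(path):
    """Parse a FASTA file into a list of (header, sequence) tuples."""
    records, header, chunks = [], None, []
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def iter_fragments(root):
    """Yield (size, replicate, path, header, sequence) for each fragment found."""
    for size in BACKBONE_SIZES:
        for rep in replicate_ids(size):
            rep_dir = Path(root) / str(size) / str(rep)
            if not rep_dir.is_dir():
                continue  # Skip replicates that are not present locally.
            for path in sorted(p for p in rep_dir.iterdir() if p.is_file()):
                for header, seq in read_fasta(path):
                    yield size, rep, path, header, seq


def expected_total_files():
    """Total number of fragment files implied by the description."""
    return sum(len(replicate_ids(s)) * FILES_PER_REPLICATE for s in BACKBONE_SIZES)
```

Under these assumptions, expected_total_files() returns 4200 (four sizes with five replicates of 200 files each, plus one replicate of 200 files), which can serve as a quick completeness check against the number of files that iter_fragments actually visits after download.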

Provider:
University of Illinois at Urbana-Champaign
Created:
2021-06-16