Datasets, matrices and supplementary figure for QMaker
Source: DataCite Commons. Updated: 2020-12-24. Indexed: 2024-08-18.
Download link:
https://figshare.com/articles/dataset/QMaker-datasets_zip/9768101/11
Description:
Pfam_datasets.zip: Pfam training and test sets used with QMaker.
Matrices-normalized.zip: Eight output matrices for the LG, Pfam, Pfam-gb, Bird, Insect, Mammal, Plant, and Yeast datasets.
Figure S1_new.pdf: Performance of five matrices, including Q.pfam, JTT, LG, and WAG, on five clade-specific datasets: Bird, Plant, Insect, Yeast, and Mammal.
Table S1 (Correlations).docx: Correlation values (multiplied by 1000) between six new matrices and 20 existing matrices. The upper half holds correlations of frequencies; the lower half holds correlations of exchangeabilities.
sample_training_10alignments.zip, sample_training_10genes.zip: Two small datasets with training scripts. The first has 10 alignments that do not share species, extracted from the Pfam dataset, and should be trained with option -S. The second has 10 genes of the same species, extracted from the Plant dataset, and should be trained with option -p. Each dataset needs about 30 minutes of training time on a 10-core machine.
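The layout described for Table S1 (frequency correlations in the upper half, exchangeability correlations in the lower half, values scaled by 1000) can be unpacked with a few lines of code. The following is a minimal sketch, assuming the table has already been exported from the .docx file into a square list of lists; the function name `split_correlation_table` and the 3x3 example values are hypothetical and are not taken from the dataset.

```python
def split_correlation_table(table, scale=1000):
    """Split a square table into frequency and exchangeability correlations.

    Assumption: entries above the diagonal are frequency correlations and
    entries below it are exchangeability correlations, as the description of
    Table S1 states. Values are divided by `scale` to undo the 1000x scaling.
    Both results are keyed by (i, j) with i < j.
    """
    n = len(table)
    if any(len(row) != n for row in table):
        raise ValueError("expected a square table")

    freq, exch = {}, {}
    for i in range(n):
        for j in range(n):
            if i < j:
                # Upper half: correlation of frequencies between i and j.
                freq[(i, j)] = table[i][j] / scale
            elif i > j:
                # Lower half: correlation of exchangeabilities, stored
                # under the same (smaller, larger) key as the upper half.
                exch[(j, i)] = table[i][j] / scale
    return freq, exch


# Hypothetical values for three matrices, for illustration only.
example = [
    [1000, 950, 900],
    [800, 1000, 870],
    [700, 650, 1000],
]
freq, exch = split_correlation_table(example)
```

Keying both dictionaries by the same (i, j) pair makes it easy to look up the frequency and exchangeability correlations for one pair of matrices side by side.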
Provider:
figshare
Created:
2020-11-01



