Dataset for Dense sampling of taxa and characters improves phylogenetic resolution among deltocephaline leafhoppers (Hemiptera: Cicadellidae: Deltocephalinae)
Source: DataCite Commons. Updated 2023-12-04; indexed 2024-07-13.
Download link:
https://databank.illinois.edu/datasets/IDB-5060204
Description:
The following files were used to reconstruct the phylogeny of the leafhopper subfamily Deltocephalinae, using IQ-TREE v1.6.12 and ASTRAL v4.10.5.
**1) taxon_sampling.csv**: contains the sequencing IDs (first column) and the taxonomic information (second column) of each sample. The sequencing IDs are used in the alignment files and partition files.
**2) concatenated_nt.phy**: concatenated nucleotide alignment used for the maximum likelihood analysis of Deltocephalinae by IQ-TREE v1.6.12. The file lists the sequences of 163,365 nucleotide positions from 429 genes in 730 samples. Hyphens represent gaps.
**3) concatenated_nt_partition.nex**: the partitions for the concatenated nucleotide alignment. The file partitions the 163,365 nucleotide characters into 429 character sets and defines the best substitution model for each character set.
**4) concatenated_aa.phy**: concatenated amino acid alignment used for the maximum likelihood analysis of Deltocephalinae by IQ-TREE v1.6.12. The file gives the sequences of 53,969 amino acids from 429 genes in 730 samples. Hyphens represent gaps.
**5) concatenated_aa_partition.nex**: the partitions for the concatenated amino acid alignment. The file partitions the 53,969 characters into 429 character sets and defines the best substitution model for each character set.
**6) concatenated_nt_106taxa.phy**: a reduced concatenated nucleotide alignment representing 107 samples × 86 genes. This alignment is used to estimate the divergence times of Deltocephalinae using MCMCTree in PAML v4.9. The file lists the sequences of 79,239 nucleotide positions from 86 genes in 107 samples. Hyphens represent gaps.
**7) concatenated_nt_106taxa_partition.nex**: the partitions for the nucleotide alignment concatenated_nt_106taxa.phy. The file partitions the 79,239 nucleotide characters into 86 character sets and defines the best substitution model for each character set.
**8) individual_gene_alignment.zip**: contains 429 FAS files, one for each of the partitioned nucleotide character sets in the concatenated_nt_partition.nex file. Hyphens represent gaps. These files were used to construct gene trees using IQ-TREE v1.6.12, followed by multispecies coalescent analysis using ASTRAL v4.10.5.
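The counts quoted in the description (samples, alignment length, number of character sets) can be checked against a downloaded file pair. The following is a minimal sketch, not the authors' pipeline. It assumes that each PHYLIP file starts with a header line giving the number of taxa and the number of sites, and that each partition file declares its character sets as NEXUS `charset name = start-end;` statements, optionally with a `\step` suffix. The function names `read_phylip_dims` and `read_charsets` and the toy data are hypothetical, chosen only for illustration.

```python
import re

# Assumed NEXUS syntax: "charset <name> = <start>-<end>[\<step>] ...;"
CHARSET_RE = re.compile(r"charset\s+(\S+)\s*=\s*([^;]+);", re.IGNORECASE)
RANGE_RE = re.compile(r"(\d+)\s*-\s*(\d+)(?:\\(\d+))?")


def read_phylip_dims(text):
    """Return (n_taxa, n_sites) from the first non-empty line of a PHYLIP file."""
    for line in text.splitlines():
        if line.strip():
            n_taxa, n_sites = line.split()[:2]
            return int(n_taxa), int(n_sites)
    raise ValueError("empty PHYLIP input")


def read_charsets(text):
    """Map each charset name to the number of alignment columns it covers."""
    sizes = {}
    for name, spec in CHARSET_RE.findall(text):
        total = 0
        for start, end, step in RANGE_RE.findall(spec):
            # Ranges are 1-based and inclusive; step defaults to 1.
            total += len(range(int(start), int(end) + 1, int(step or 1)))
        sizes[name] = total
    return sizes


# Toy stand-ins for an alignment and its partition file (not real data).
toy_phy = "3 10\nA  ACGTACGTAC\nB  ACGT--GTAC\nC  ACGTACG-AC\n"
toy_nex = """#nexus
begin sets;
    charset gene1 = 1-6;
    charset gene2 = 7-10;
end;
"""

n_taxa, n_sites = read_phylip_dims(toy_phy)
sizes = read_charsets(toy_nex)

# The character sets should cover every column of the alignment.
assert sum(sizes.values()) == n_sites
print(n_taxa, n_sites, len(sizes))
```

Applied to the contents of concatenated_nt.phy and concatenated_nt_partition.nex, the same check would be expected to report 730 samples, 163,365 sites, and 429 character sets if the files match the description and the format assumptions above.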
Provider:
University of Illinois at Urbana-Champaign
Created:
2022-03-01



