
Data from: The effect of gene flow on coalescent-based species-tree inference

Source: Mendeley Data. Updated 2024-06-25; indexed 2024-06-27.
Download link:
https://zenodo.org/records/4984251
Description:
Most current methods for inferring species-level phylogenies under the coalescent model assume that no gene flow occurs following speciation. Several studies have examined the impact of gene flow (e.g., Eckert and Carstens (2008); Chung and Ané (2011); Leaché et al. (2014); Solís-Lemus et al. (2016)) and of ancestral population structure (DeGiorgio and Rosenberg, 2016) on the performance of species-level phylogenetic inference, and analytic results have been proven for network models of gene flow (e.g., Solís-Lemus et al. (2016); Zhu et al. (2016)). However, there are few analytic results for a continuous model of gene flow following speciation, despite the development of mathematical tools that could facilitate such study (e.g., Hobolth et al. (2011); Andersen et al. (2014); Tian and Kubatko (2016)). In this paper, we consider a three-taxon isolation-with-migration model that allows gene flow between sister taxa for a brief period following speciation, as well as variation in the effective population sizes across the species tree. We derive the probabilities of each of the three gene tree topologies under this model, and show that for certain choices of the gene flow and effective population size parameters, anomalous gene trees (i.e., gene trees that are discordant with the species tree but that have higher probability than the gene tree concordant with the species tree) exist. We characterize the region of parameter space producing anomalous trees, and show that the probability of the gene tree that is concordant with the species tree can be arbitrarily small. We then show that there is theoretical support for using SVDQuartets with an outgroup to infer the rooted three-taxon species tree in a model of gene flow between sister taxa. We study the performance of SVDQuartets on simulated data and compare it to three other commonly used methods for species tree inference: ASTRAL, MP-EST, and concatenation. The simulations show that ASTRAL, MP-EST, and concatenation can be statistically inconsistent when gene flow is present, while SVDQuartets performs well, though large sample sizes may be required for certain parameter choices.
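For orientation, the definition of an anomalous gene tree can be illustrated against the standard no-gene-flow baseline. The sketch below is not the isolation-with-migration model derived in the paper. It assumes the classical multispecies coalescent for a species tree ((A,B),C) with a single internal branch of length `t` in coalescent units, where the concordant topology has probability 1 - (2/3)e^(-t) and each discordant topology has probability (1/3)e^(-t). The function names and the dictionary layout are illustrative choices, not part of the dataset.

```python
import math


def three_taxon_gene_tree_probs(t):
    """Gene tree topology probabilities for species tree ((A,B),C).

    Assumes the standard multispecies coalescent with NO gene flow
    (a baseline, not the paper's isolation-with-migration model).
    t is the internal branch length in coalescent units.
    """
    if t < 0:
        raise ValueError("branch length must be non-negative")
    p_discordant = math.exp(-t) / 3.0      # each of the two discordant trees
    p_concordant = 1.0 - 2.0 * p_discordant
    return {
        "((A,B),C)": p_concordant,
        "((A,C),B)": p_discordant,
        "((B,C),A)": p_discordant,
    }


def is_anomalous(probs, species_tree="((A,B),C)"):
    """True if any discordant topology is more probable than the concordant one."""
    concordant = probs[species_tree]
    return any(p > concordant
               for topology, p in probs.items()
               if topology != species_tree)


if __name__ == "__main__":
    for t in (0.1, 0.5, 1.0, 2.0):
        probs = three_taxon_gene_tree_probs(t)
        print(t, probs, "anomalous:", is_anomalous(probs))
```

Under this baseline the concordant topology is at least as probable as either discordant one for every `t`, so `is_anomalous` returns `False`. The abstract's result is that gene flow between sister taxa, combined with variation in effective population sizes, can reverse this ordering. `is_anomalous` accepts any set of topology probabilities, so it can be applied to probabilities computed under other models.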
Created:
2023-06-28