five

Data from: The effect of gene flow on coalescent-based species-tree inference

收藏
Mendeley Data2024-06-25 更新2024-06-27 收录
下载链接:
https://zenodo.org/records/4984251
下载链接
链接失效反馈
资源简介:
Most current methods for inferring species-level phylogenies under the coalescent model assume that no gene flow occurs following speciation. Several studies have examined the impact of gene flow (e.g., Eckert and Carstens (2008); Chung and Ane (2011); Leache et al. (2014); Solis-Lemus et al. (2016)) and of ancestral population structure (DeGeorgio and Rosenberg, 2016) on the performance of species-level phylogenetic inference, and analytic results have been proven for network models of gene flow (e.g., Solis-Lemus et al. (2016); Zhu et al. (2016)). However, there are few analytic results for a continuous model of gene flow following speciation, despite the development of mathematical tools that could facilitate such study (e.g., Hobolth et al. (2011); Andersen et al. (2014); Tian and Kubatko (2016)). In this paper, we consider a three-taxon isolation-with-migration model that allows gene flow between sister taxa for a brief period following speciation, as well as variation in the effective population sizes across the species tree. We derive the probabilities of each of the three gene tree topologies under this model, and show that for certain choices of the gene flow and effective population size parameters, anomalous gene trees (i.e., gene trees that are discordant with the species tree but that have higher probability than the gene tree concor- dant with the species tree) exist. We characterize the region of parameter space producing anomalous trees, and show that the probability of the gene tree that is concordant with the species tree can be arbitrarily small. We then show that there is theoretical support for using SVDQuartets with an outgroup to infer the rooted three-taxon species tree in a model of gene flow between sister taxa. We study the performance of SVDQuartets on simulated data and compare it to three other commonly-used methods for species tree inference, AS- TRAL, MP-EST, and concatenation. The simulations show that ASTRAL, MP-EST, and concatenation can be statistically inconsistent when gene flow is present, while SVDQuartets performs well, though large sample sizes may be required for certain parameter choices.
创建时间:
2023-06-28
用户留言
有没有相关的论文或文献参考?
这个数据集是基于什么背景创建的?
数据集的作者是谁?
能帮我联系到这个数据集的作者吗?
这个数据集如何下载?
点击留言
数据主题
具身智能
数据集  4099个
机构  8个
大模型
数据集  439个
机构  10个
无人机
数据集  37个
机构  6个
指令微调
数据集  36个
机构  6个
蛋白质结构
数据集  50个
机构  8个
空间智能
数据集  21个
机构  5个
5,000+
优质数据集
54 个
任务类型
进入经典数据集
热门数据集

ERIC (Education Resources Information Center)

ERIC (Education Resources Information Center) 是一个广泛的教育文献数据库,包含超过130万条记录,涵盖从1966年至今的教育研究、政策和实践。数据集内容包括教育相关的期刊文章、书籍、研究报告、会议论文、技术报告、政策文件等。

eric.ed.gov 收录

flames-and-smoke-datasets

该仓库总结了多个公开的火焰和烟雾数据集,包括DFS、D-Fire dataset、FASDD、FLAME、BoWFire、VisiFire、fire-smoke-detect-yolov4、Forest Fire等数据集。每个数据集都有详细的描述,包括数据来源、图像数量、标注信息等。

github 收录

THCHS-30

“THCHS30是由清华大学语音与语言技术中心(CSLT)发布的开放式汉语语音数据库。原始录音是2002年在清华大学国家重点实验室的朱晓燕教授的指导下,由王东完成的。清华大学计算机科学系智能与系统,原名“TCMSD”,意思是“清华连续普通话语音数据库”,时隔13年出版,由王东博士发起,并得到了教授的支持。朱小燕。我们希望为语音识别领域的新研究人员提供一个玩具数据库。因此,该数据库对学术用户完全免费。整个软件包包含建立中文语音识别所需的全套语音和语言资源系统。”

OpenDataLab 收录

O*NET

O*NET(Occupational Information Network)是一个综合性的职业信息数据库,提供了关于各种职业的详细描述,包括技能要求、工作活动、知识领域、工作环境等。该数据集被广泛用于职业分析、教育和劳动力市场研究。

www.onetonline.org 收录

UAVDT Dataset

The authors constructed a new UAVDT Dataset focused on complex scenarios with new level challenges. Selected from 10 hours raw videos, about 80, 000 representative frames are fully annotated with bounding boxes as well as up to 14 kinds of attributes (e.g., weather condition, flying altitude, camera view, vehicle category, and occlusion) for three fundamental computer vision tasks: object detection, single object tracking, and multiple object tracking.

datasetninja.com 收录