Data from: Chromosome-scale inference of hybrid speciation and admixture with convolutional neural networks
Source: DataCite Commons | Updated: 2026-03-12 | Indexed: 2026-04-25
Download link:
https://datadryad.org/dataset/doi:10.5061/dryad.63xsj3v0r
Description:
Inferring the frequency and mode of hybridization among closely related
organisms is an important step for understanding the process of
speciation and can help to uncover reticulated patterns of
phylogeny more generally. Phylogenomic methods to test for the presence of
hybridization come in many varieties and typically operate by
leveraging expected patterns of genealogical discordance in the absence of
hybridization. An important assumption made by these tests is that the
data (genes or SNPs) are independent given the species tree. However, when
the data are closely linked, it is especially important to consider their
non-independence. Recently, deep learning techniques such as convolutional
neural networks (CNNs) have been used to perform population genetic
inferences with linked SNPs coded as binary images. Here we use CNNs for
selecting among candidate hybridization scenarios using the tree topology
(((P1,P2),P3),Out) and a matrix of pairwise nucleotide divergence
(dXY) calculated in windows across the genome. Using coalescent
simulations to train and independently test a neural network, we show that
our method, HyDe-CNN, accurately performs model selection for
hybridization scenarios across a wide breadth of parameter space. We then
used HyDe-CNN to test models of admixture in
Heliconius butterflies and compared it to a random
forest classifier trained on introgression-based statistics. Given the
flexibility of our approach, the dropping cost of long-read sequencing,
and the continued improvement of CNN architectures, we anticipate that
inferences of hybridization using deep learning methods like ours will
help researchers to better understand patterns of admixture in their study
organisms.
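
As a rough illustration of the kind of input described above, the sketch below computes pairwise nucleotide divergence (dXY) in non-overlapping windows for every pair of populations. It is a minimal example under stated assumptions, not the HyDe-CNN implementation: the function name `windowed_dxy`, its parameters, the 0/1 haplotype encoding, and the row-per-pair layout are all hypothetical, and the exact image layout fed to the network is not specified in this description. For biallelic sites with allele frequencies p_X and p_Y, each site contributes p_X(1 - p_Y) + p_Y(1 - p_X), and the window total is divided by the window length.

```python
from itertools import combinations


def windowed_dxy(pops, positions, seq_len, window):
    """Windowed dXY for every pair of populations (illustrative sketch).

    pops      : dict mapping population name -> list of haplotypes,
                each haplotype a list of 0/1 alleles (one per SNP)
    positions : 0-based genomic position of each SNP (all < seq_len)
    seq_len   : total length of the region in base pairs
    window    : window size in base pairs (non-overlapping windows)

    Returns a dict mapping (popA, popB) -> list of per-window dXY values.
    """
    names = list(pops)
    pairs = list(combinations(names, 2))
    n_windows = -(-seq_len // window)  # ceiling division
    dxy = {pair: [0.0] * n_windows for pair in pairs}

    for site, pos in enumerate(positions):
        w = pos // window
        # Frequency of the "1" allele in each population at this site.
        freqs = {
            name: sum(hap[site] for hap in pops[name]) / len(pops[name])
            for name in names
        }
        for a, b in pairs:
            pa, pb = freqs[a], freqs[b]
            # Probability that one draw from each population differs.
            dxy[(a, b)][w] += pa * (1 - pb) + pb * (1 - pa)

    # Convert per-window sums to per-site averages; the last window
    # may be shorter than the others.
    for pair in pairs:
        for w in range(n_windows):
            width = min(window, seq_len - w * window)
            dxy[pair][w] /= width
    return dxy


# Toy data (made up for illustration): 2 haplotypes per population, 3 SNPs.
example_pops = {
    "P1": [[0, 1, 0], [0, 1, 1]],
    "P2": [[1, 1, 0], [0, 0, 0]],
    "P3": [[1, 0, 1], [1, 0, 1]],
    "Out": [[0, 0, 0], [0, 0, 0]],
}
example = windowed_dxy(example_pops, positions=[2, 7, 15], seq_len=20, window=10)
for pair, values in example.items():
    print(pair, values)
```

Each row of the result (one per population pair) could then be stacked into a two-dimensional array with windows along the other axis, which is one plausible way to present windowed divergence to a CNN.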
Provider:
Dryad
Created:
2020-08-06



