
Data for: PickMe: Sample selection for species tree reconstruction using coalescent weighted quartets

Source: DataONE. Updated 2025-06-23; indexed 2025-06-28.
Download link:
https://search.dataone.org/view/sha256:91bebbf49a28c66860be411268be21a2f5c9fd6fe073b2820e2dfbc6b621551a
Resource description:
After collecting large data sets for phylogenomics studies, researchers must decide which genes or samples to include when reconstructing a species tree. Incomplete or unreliable data sets make this decision more difficult, and researchers often rely on ad hoc strategies to maximize sampling while ensuring sufficient data for accurate inference. An algorithm called PickMe formalizes the sample selection process, assuming that the samples evolved under the Tree Multispecies Coalescent model. We propose a Bayesian framework for selecting samples for species tree analysis. Given a collection of gene trees, we compute a posterior probability for each quartet, describing the likelihood that the species tree displays this topology. From this, we assign each sample a reliability score, computed as the average of a scaled version of the posterior probabilities. PickMe uses these weights to recommend which samples to include in a species tree analysis. Analysis of simulated data showed t...

We obtained targeted sequence data for 763 putatively single-copy nuclear loci for samples of 59 North American milkweed species, three African outgroup species (*Asclepias physocarpa*, *A. fruticosa*, and *A. fornicata*), and one additional outgroup, *Pergularia daemia*, using the target enrichment baits of Weitemier et al. (2014) (see Supplemental Material). Data for 32 of these samples and orthologs from the genome sequence of *Asclepias syriaca* (Weitemier et al. 2019) were included in the analyses of Boutte et al. (2019), and nuclear sequence data for the additional 30 samples were generated using the DNA sequencing and assembly methods described therein. Boutte et al. (2019) had excluded the 30 newly analyzed samples based on an ad hoc minimum gene recovery criterion of 600 genes (79%), with the goal of high gene occupancy for all samples in species tree analyses.

For the analyses conducted here, we masked assembled...

# Data for: PickMe: sample selection for species tree reconstruction using coalescent weighted quartets

[https://doi.org/10.5061/dryad.3r2280ggv](https://doi.org/10.5061/dryad.3r2280ggv)

## Description of the data and file structure

Data were collected for the analysis of the evolutionary relationships among milkweeds. The remaining data were used to test the PickMe algorithm for sample selection in the context of phylogenomic analysis.

**Supplemental Materials:**

- **SupplA1A7.pdf:** Contains a PDF of supplemental appendices A1-A7 as referenced in the published article.

**Data Descriptions**

- **Milkweed-Sequence-Files.zip**: Contains sequence data for the analysis. All sequences have been referenced on GenBank.
- **estimated-gene-trees-NJ-Uncorrected** and **estimated-gene-trees-RAxML**: Contain all estimated milkweed gene trees as described in the associated article. Sample names were cleaned up for the main manuscript. A log for matc...
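The scoring idea in the abstract (a posterior probability per quartet, then a per-sample average of scaled posteriors) can be illustrated with a small sketch. This is not the PickMe implementation and does not use the coalescent model from the article. As a stand-in, it assumes a uniform Dirichlet prior over the three possible topologies of each quartet and takes the posterior mean of the best-supported one. The scaling from [1/3, 1] to [0, 1], the input format (gene-tree topology counts per quartet), and all function names are hypothetical choices made for illustration only.

```python
from collections import defaultdict


def quartet_posterior(counts, prior=1.0):
    """Posterior mean probability of the best-supported of three topologies.

    ASSUMPTION: a symmetric Dirichlet prior over the three quartet
    topologies, used here as a simple stand-in for the coalescent-based
    posterior described in the article.
    """
    total = sum(counts) + 3 * prior
    return (max(counts) + prior) / total


def scale(p):
    """Map a posterior from [1/3, 1] to [0, 1] (hypothetical scaling)."""
    return max(0.0, (p - 1 / 3) / (2 / 3))


def sample_scores(quartet_counts):
    """Average the scaled posteriors over all quartets containing each sample.

    quartet_counts maps a 4-tuple of sample names to a 3-tuple giving how
    many gene trees display each of the three possible topologies.
    """
    per_sample = defaultdict(list)
    for taxa, counts in quartet_counts.items():
        weight = scale(quartet_posterior(counts))
        for taxon in taxa:
            per_sample[taxon].append(weight)
    return {t: sum(ws) / len(ws) for t, ws in per_sample.items()}


# Toy input: sample "D" sits in a well-resolved quartet, while sample "E"
# sits in a quartet where the gene trees disagree almost evenly.
toy_counts = {
    ("A", "B", "C", "D"): (90, 5, 5),
    ("A", "B", "C", "E"): (34, 33, 33),
}
scores = sample_scores(toy_counts)
for name in sorted(scores, key=scores.get, reverse=True):
    print(f"{name}: {scores[name]:.3f}")
```

In this toy input, a sample that appears only in poorly resolved quartets receives a score near zero, which is the kind of signal a selection procedure could use to flag it for exclusion.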
Created:
2025-06-24