
Data for: PickMe: Sample selection for species tree reconstruction using coalescent weighted quartets

DataONE, updated 2025-02-11, indexed 2025-04-26
Download link:
https://search.dataone.org/view/sha256:e989b86eeace2a3a8c8c21d08834ae4df9385389adcad6d4559c75109d69f4bc
Resource description:
After collecting large data sets for phylogenomics studies, researchers must decide which genes or samples to include when reconstructing a species tree. Incomplete or unreliable data sets make the empiricist's decision more difficult. Researchers rely on ad hoc strategies to maximize sampling while ensuring sufficient data for accurate inferences. An algorithm called PickMe formalizes the sample selection process, assuming that the samples evolved under the Tree Multispecies Coalescent model. We propose a Bayesian framework for selecting samples for species tree analysis. Given a collection of gene trees, we compute a posterior probability for each quartet, describing the likelihood that the species tree displays this topology. From this, we assign individual samples reliability scores computed as the average of a scaled version of the posterior probabilities. PickMe uses these weights to recommend which samples to include in a species tree analysis. Analysis of simulated data showed t...

We obtained targeted sequence data for 763 putatively single-copy nuclear loci for samples of 59 North American milkweed species, three African outgroup species (*Asclepias physocarpa*, *A. fruticosa*, and *A. fornicata*), and one additional outgroup, *Pergularia daemia*, using the target enrichment baits of Weitemier et al. (2014) (see Supplemental Material). Data for 32 of these samples and orthologs from the genome sequence of *Asclepias syriaca* (Weitemier et al., 2019) were included in the analyses of Boutte et al. (2019), and nuclear sequence data for the additional 30 samples were generated using the DNA sequencing and assembly methods described therein. Boutte et al. (2019) had excluded the 30 newly analyzed samples based on an ad hoc minimum gene recovery criterion of 600 genes (79%), with the goal of high gene occupancy for all samples for species tree analyses. For the analyses conducted here, we masked assembled...

# Data for: PickMe: sample selection for species tree reconstruction using coalescent weighted quartets

[https://doi.org/10.5061/dryad.3r2280ggv](https://doi.org/10.5061/dryad.3r2280ggv)

## Description of the data and file structure

Data was collected for the analysis of the evolutionary relationships among milkweeds. The remaining data was used to test the PickMe algorithm for sample selection in the context of phylogenomic analysis.

**Data Descriptions**

- **Milkweed-Sequence-Files.zip**: Contains sequence data for the analysis. By the time of publication, all sequences will be referenced on GenBank.
- **estimated-gene-trees-NJ-Uncorrected** and **estimated-gene-trees-RAxML**: Contain all estimated milkweed gene trees as described in the associated article. Sample names were cleaned up for the main manuscript. A log for matching is listed in a text file.
- **OldSpeciesTree.cf.tree**: The species tree referenced ...
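The description above says each sample receives a reliability score computed as the average of a scaled version of the quartet posterior probabilities. The following is a minimal sketch of that averaging step only, not the PickMe implementation. The function name `sample_scores`, the input layout (a dictionary keyed by four-sample sets), the example probabilities, and the default identity scaling are all assumptions made for illustration; the scaling actually used by PickMe is defined in the associated article.

```python
from collections import defaultdict
from typing import Callable, Dict, FrozenSet


def sample_scores(
    quartet_posteriors: Dict[FrozenSet[str], float],
    scale: Callable[[float], float] = lambda p: p,  # placeholder scaling (assumption)
) -> Dict[str, float]:
    """Average the scaled posterior over every quartet that contains each sample."""
    totals: Dict[str, float] = defaultdict(float)
    counts: Dict[str, int] = defaultdict(int)
    for quartet, posterior in quartet_posteriors.items():
        if len(quartet) != 4:
            raise ValueError("each key must contain exactly four sample names")
        scaled = scale(posterior)
        for sample in quartet:
            totals[sample] += scaled
            counts[sample] += 1
    return {sample: totals[sample] / counts[sample] for sample in totals}


# Hypothetical posteriors for the five quartets on samples A..E (made-up values).
posteriors = {
    frozenset("ABCD"): 0.9,
    frozenset("ABCE"): 0.5,
    frozenset("ABDE"): 0.7,
    frozenset("ACDE"): 0.3,
    frozenset("BCDE"): 0.4,
}
scores = sample_scores(posteriors)

# Samples ordered from highest to lowest average score.
ranking = sorted(scores, key=scores.get, reverse=True)
```

In this toy input, sample E appears only in the lower-probability quartets and so ends up with the lowest average. How such scores translate into an inclusion recommendation is described in the associated article and is not modeled here.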
Created:
2025-02-12