Data for bioregionalisation and ancestral range estimation in the daisy family

Name: Data for bioregionalisation and ancestral range estimation in the daisy family
Creator: Commonwealth Scientific and Industrial Research Organisation
License: No description available

Indexed from Research Data Australia on 2024-12-14

Download link:

https://researchdata.edu.au/data-bioregionalisation-ancestral-daisy-family/1376460

Description:

This data package contains the data underlying the manuscript McDonald-Spicer et al., "Big data for a large clade: bioregionalisation and ancestral range estimation in the daisy family (Asteraceae)", which presents research on the global biogeography and biogeographic history of the Asteraceae. It includes (1) a supermatrix of DNA sequence data obtained from GenBank and BOLD, with partitioning information, (2) distribution data at TDWG level 3, originally based on a database extract of the Global Compositae Checklist (GCC) but subsequently cleaned and supplemented with additional data for some geographic areas, and (3) input files for ancestral area estimation using the R package BioGeoBEARS.

Lineage:

Spatial data

The spatial dataset was based on data extracted from the GCC (compositae.landcareresearch.co.nz, accessed 15 Aug 2014), a database of distribution information for the family Asteraceae. We used OpenRefine (Huynh & Mazzocchi, 2014) to clean the dataset: correcting the spelling of taxon names, collapsing varieties and subspecies to species level, and removing hybrids and taxa with a distribution listed as ‘null’. Additional distribution information was added for New Zealand, Cordoba Province in Argentina, Mongolia, South Africa, and Mexico. We removed species from regions where they are non-native. The final spatial dataset used in this study included 27,019 species representing 1,636 genera. All analyses were conducted at TDWG level 3 spatial resolution.

Phylogeny

Sequences from the nuclear ribosomal Internal Transcribed Spacer region (ITS) and three chloroplast regions (matK, rbcL, and trnL-trnF) were obtained from GenBank and BOLD (Ratnasingham & Hebert, 2007). We used genera as Operational Taxonomic Units and selected a representative sequence for each genus and locus.

Gene regions were aligned individually using MAFFT 7 (Katoh & Standley, 2013) and manually edited in BioEdit 7.0.5 (Hall, 1999). The four sequence regions were combined into a supermatrix of 1,273 genera and 9,030 characters. The phylogeny was inferred using RAxML (Stamatakis, 2014) under the GTRCAT model, partitioned by sequence region. The tree was rooted on the Barnadesieae, which are sister to the rest of the family (Funk et al., 2005).

Time calibration

We time-calibrated our phylogeny using Penalized Likelihood as implemented in the chronos function of the R package ape (Sanderson, 2002; Paradis et al., 2004; R Core Team, 2016). We set nine calibration points (Appendix S3 of the manuscript). We tested all three implemented clock models (relaxed, correlated, and discrete) and lambda values of 1 and 10. The favoured clock model was discrete with lambda = 10.
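
The spatial cleaning steps described under "Spatial data" were carried out in OpenRefine. As a rough illustration only, the same kind of filtering can be sketched in Python. The record layout (`taxon`, `area`, `native`), the function names, and the rule of treating `×` or a standalone `x` as a hybrid marker are all assumptions made for this sketch and are not taken from the data package.

```python
from collections import defaultdict


def to_species(name):
    """Collapse an infraspecific name to 'Genus species'.

    Returns None for hybrids and for names with fewer than two words.
    Hybrid detection here is a simple assumption: a multiplication sign
    anywhere in the name, or a standalone 'x' token.
    """
    tokens = name.split()
    if len(tokens) < 2:
        return None
    if "×" in name or "x" in [t.lower() for t in tokens]:
        return None
    return f"{tokens[0]} {tokens[1]}"


def clean_records(records):
    """Build a species -> set of area codes mapping from raw records.

    Each record is a dict with hypothetical keys:
      'taxon'  : scientific name, possibly with var./subsp. rank
      'area'   : area code string, or 'null' / None when missing
      'native' : True if the taxon is native to that area
    """
    ranges = defaultdict(set)
    for rec in records:
        species = to_species(rec["taxon"])
        if species is None:
            continue  # hybrid or unusable name
        area = rec["area"]
        if area is None or area.strip().lower() in ("", "null"):
            continue  # no usable distribution
        if not rec["native"]:
            continue  # drop non-native occurrences
        ranges[species].add(area.strip().upper())
    return dict(ranges)


# Placeholder names and codes, for illustration only.
example = [
    {"taxon": "Aus bus var. cus", "area": "WAU", "native": True},
    {"taxon": "Aus bus", "area": "SOA", "native": True},
    {"taxon": "Aus × dus", "area": "SOA", "native": True},
    {"taxon": "Eus fus", "area": "null", "native": True},
    {"taxon": "Eus fus", "area": "NZN", "native": False},
    {"taxon": "Eus fus", "area": "GRB", "native": True},
]
cleaned = clean_records(example)
```

In this sketch the variety is merged into its parent species, while the hybrid, the ‘null’ record, and the non-native record are dropped. Spelling correction of taxon names, which the description also mentions, is not covered here.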

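The supermatrix step described under "Phylogeny" combines per-locus alignments and records which columns belong to which region. A minimal sketch of that idea follows, assuming each locus is already aligned and stored as a dictionary of genus to sequence. The function name, the input layout, and the use of `?` for genera missing a locus are assumptions for illustration; the partition lines follow the `DNA, name = start-end` form used in RAxML partition files.

```python
def build_supermatrix(alignments):
    """Concatenate per-locus alignments into one supermatrix.

    alignments: dict mapping locus name -> dict of genus -> aligned sequence.
    Returns (matrix, partitions), where matrix maps genus -> concatenated
    sequence and partitions is a list of RAxML-style partition lines.
    Genera missing from a locus are padded with '?' (an assumption).
    """
    taxa = sorted({genus for aln in alignments.values() for genus in aln})
    pieces = {genus: [] for genus in taxa}
    partitions = []
    start = 1
    for locus, aln in alignments.items():
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"{locus}: sequences are not aligned to equal length")
        n = lengths.pop()
        for genus in taxa:
            pieces[genus].append(aln.get(genus, "?" * n))
        partitions.append(f"DNA, {locus} = {start}-{start + n - 1}")
        start += n
    matrix = {genus: "".join(parts) for genus, parts in pieces.items()}
    return matrix, partitions


# Toy alignments with placeholder genus names, for illustration only.
toy = {
    "ITS": {"GenusA": "ACGT", "GenusB": "AC-T"},
    "matK": {"GenusA": "GGA", "GenusC": "GTA"},
}
matrix, partitions = build_supermatrix(toy)
```

With the toy input, every genus ends up with a sequence of the same total length, and the two partition lines cover columns 1-4 and 5-7.
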
Provider:

Commonwealth Scientific and Industrial Research Organisation