Data and code for: Does the definition of a novel environment affect the ability to detect cryptic genetic variation?

Mendeley Data · Updated 2024-04-13 · Indexed 2024-06-27

Download link:

https://datadryad.org/stash/dataset/doi:10.5061/dryad.pzgmsbcsc


Resource description:

# Does the definition of a novel environment affect the ability to detect cryptic genetic variation?

---

### Author: Camille L. Riley

This dataset contains the data and code necessary to perform the meta-analysis detailed in Riley et al. 2023, *Does the definition of a novel environment affect the ability to detect cryptic genetic variation?* Individual README files (.Rmd and .html) are provided, containing code and detailed usage notes. The methodology has two stages: the generation of effect sizes (effect_size_README.Rmd), and the meta-analysis of those effect sizes (meta_analysis_README.Rmd), which covers the statistical analysis and the generation of the plots included in the article.

## Description of the data and file structure

### Datasets

The G-matrix data and associated metadata used in this analysis were sourced from eligible studies retrieved via systematic review. See supplementary table S1 for the reference list.

**1. G-matrix_data.zip**

Zipped file comprising individual folders, each labelled with a unique file name corresponding to an included article (e.g. EK009). Where multiple G-matrices were extracted from the same study, the file name ends with a lower-case letter. Each folder contains three .csv files:

* A_n=._G.csv: G-matrix corresponding to the ancestral environment. "n=" indicates sample size.
* N_n=._G.csv: G-matrix corresponding to the novel environmental treatment. "n=" indicates sample size.
* Desc_data.csv: Additional descriptive data corresponding to the G-matrix data, comprising the following variables:
  * trait: phenotypic traits included in the G-matrices,
  * env_identifier: ancestral or novel environment,
  * treatment: environmental variable manipulated in experimental treatments,
  * mean, standard deviation (sd), and standard error (se) of phenotypic trait values,
  * Va: additive genetic variance,
  * Nind: number of individuals,
  * Nfam: number of families,
  * units: unit of measurement used for trait values,
  * notes: any additional information.

**2. Mod_data.csv**

A csv file with the following variables:

* moderator: variable predicted to affect the strength of the relationship between the dependent and independent variable,
* number: number of instances the moderator variable features in the dataset.

**3. Phylo_matrix.csv**

A matrix of phylogenetic relatedness with variable names corresponding to species names represented in the dataset.

**4. SDV_effect_sizes.csv**

A csv file with the following variables:

* study_code: unique study identifier,
* effect_code: unique effect size identifier. Corresponds to names of folders in G-matrix_data.zip,
* type: type of matrix, such as G-matrix or P-matrix. All matrices examined in this study are G-matrices,
* ES: effect size value,
* S_var: sampling variance value,
* Precision: precision value, calculated as 1/SE,
* Authors: study authors,
* Year: year study was published,
* type_novel: whether the novel environmental treatment corresponded to an extreme or absolute novel environment,
* av_sample_size: average sample size,
* design: experimental design of study, whether using broad or narrow sense measures of variance,
* trait_no: number of phenotypic traits included in the G-matrix,
* Group: taxonomic group of organisms examined,
* Species: species name of organism examined,
* species_full: species name of organism examined.

**5. Taxa_group_data.csv**

A csv file with the following variables:

* group: taxonomic group,
* number: number of instances the taxonomic group features in the dataset.

### Code

The code for this analysis requires RStudio and is designed to run using R version 4.2.3. The R packages needed to perform the analysis are detailed in the R scripts below.

**1. Effect_size_README.Rmd**

This markdown file details the code used to generate effect sizes comparing the volume of G between ancestral and novel environments. The required input is **G-matrix_data.zip**.

**2. Meta_analysis_README.Rmd**

This markdown file details the code used to perform the meta-analysis and generate plots comparing the volume of G between ancestral and novel environments. The required inputs are: **SDV_effect_sizes.csv, phylo_matrix.csv, taxa_group_data.csv, mod_data.csv**.

**3. Phylogeny.R**

This R script contains the code used to create **phylo_matrix.csv**, a covariance matrix of phylogenetic relatedness for the species represented in the analysis. This matrix is used to account for the effects of phylogeny in meta-regression models.
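Two of the conventions described above can be illustrated with a short sketch. It is written in Python with the standard library only and is not part of the dataset's R code. It rests on two assumptions that the README does not state outright: that the "." in `A_n=._G.csv` and `N_n=._G.csv` stands for an integer sample size, and that the SE used for `Precision` is the square root of `S_var`. The function names and the example file name are hypothetical.

```python
import math
import re

# Assumed file-name pattern: "A_n=<digits>_G.csv" or "N_n=<digits>_G.csv".
# The README writes the sample size as "."; reading it as digits is an assumption.
G_FILE = re.compile(r"^(?P<env>[AN])_n=(?P<n>\d+)_G\.csv$")

ENVIRONMENTS = {"A": "ancestral", "N": "novel"}


def parse_g_matrix_name(filename):
    """Return (environment, sample_size) for a G-matrix file name, or None."""
    match = G_FILE.match(filename)
    if match is None:
        return None
    return ENVIRONMENTS[match.group("env")], int(match.group("n"))


def precision_from_variance(s_var):
    """Precision = 1/SE, assuming SE is the square root of the sampling variance."""
    if s_var <= 0:
        raise ValueError("sampling variance must be positive")
    return 1.0 / math.sqrt(s_var)


if __name__ == "__main__":
    # Hypothetical file name and variance, for illustration only.
    print(parse_g_matrix_name("A_n=120_G.csv"))
    print(precision_from_variance(0.04))
```

If the real file names or the SE definition differ, the regular expression and `precision_from_variance` would need adjusting; the `Precision` column shipped in SDV_effect_sizes.csv remains the authoritative value.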


Created: 2024-04-07