Data for: Estimating causal effects with machine learning: A guide for ecologists
Source: DataONE. Updated 2025-10-17; indexed 2025-10-25.
Download link:
https://search.dataone.org/view/sha256:24e830ed979b4e343cc59332c12eb3374497dfc5b0987451740cbd1c6226c73e
Description:
This repository contains the R code and data (simulated and empirical) used for the manuscript "Estimating Causal Effects with Machine Learning: A Guide for Ecologists." It provides reproducible examples demonstrating the application of four causal machine learning methods.
The dataset includes:
Simulated data generated in R to estimate the causal effect of honeybee abundance on wild bee populations. Variables include environmental covariates (e.g., soil, climate, topography), confounders (e.g., pollinated agriculture), an instrumental variable (beekeeping policy), and outcome measures (wild bee abundance). The relationships among them include a mixture of linear, nonlinear, and interaction effects.
Empirical data and example scripts illustrating the use of Causal Forests to assess heterogeneous effects of depth on Laminaria digitata abundance across Atlantic Canada, incorporating geographic (latitude, longitude) and biotic (invasive bryozoan) covariates.
Annotated R scripts implementing double machine learning (DML), targeted maximum likelihood (TML)...
# Data for: Estimating causal effects with machine learning: A guide for ecologists
Dataset DOI: [10.5061/dryad.mw6m90694](https://doi.org/10.5061/dryad.mw6m90694)
## Description of the data and file structure
Simulated and empirical data and R code associated with: Estimating causal effects with machine learning: A guide for ecologists
### Files and variables
#### File: case_study_data.csv
**Description:** Variables included in the *Laminaria digitata* dataset used for the causal forest case study.
##### Variables
* laminaria_digitata: abundance (percent cover) of *Laminaria digitata* at each sampling site
* depth: depth (in meters) at each sampling site
* lat: latitude (decimal degrees) of each sampling site
* lon: longitude (decimal degrees) of each sampling site
* Membranipora membranacea: abundance (percent cover) of *Membranipora membranacea* at each sampling site
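
The variable list above can be checked when the file is loaded. The following is a minimal sketch in base R, not code taken from `MEE_SI.R`: the helper name `read_case_study` is hypothetical, and it assumes the CSV header matches the variable names exactly as listed here, including the space in `Membranipora membranacea`. The demonstration file holds made-up placeholder values, not real observations.

```r
# Column names as listed in the README for case_study_data.csv.
expected_cols <- c("laminaria_digitata", "depth", "lat", "lon",
                   "Membranipora membranacea")

# Hypothetical helper (not part of MEE_SI.R): read the file and
# confirm that every documented column is present.
read_case_study <- function(path) {
  # check.names = FALSE keeps the space in "Membranipora membranacea";
  # the default would rename it to "Membranipora.membranacea".
  dat <- read.csv(path, check.names = FALSE)
  missing_cols <- setdiff(expected_cols, names(dat))
  if (length(missing_cols) > 0) {
    stop("Missing expected columns: ", paste(missing_cols, collapse = ", "))
  }
  dat
}

# Demonstration on a one-row made-up file (placeholder values only).
demo_path <- tempfile(fileext = ".csv")
demo <- data.frame(10, 5, 45, -63, 20)
names(demo) <- expected_cols
write.csv(demo, demo_path, row.names = FALSE)

dat <- read_case_study(demo_path)
str(dat)
```

To use the sketch on the real data, call `read_case_study("case_study_data.csv")` from the directory that contains the file. Columns whose names contain a space need backticks or string indexing, for example `dat[["Membranipora membranacea"]]`.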
#### File: MEE_SI.R
**Description:** R script containing all code associated with the manuscript "Estimati...
Created: 2025-10-18



