
Effects of Ignoring Survey Design Information for Data Reuse

Research Data Australia, indexed 2024-12-14
Download link:
https://researchdata.edu.au/effects-ignoring-survey-data-reuse/2962093
Description:
Data is currently being used, and reused, in ecological research at unprecedented rates. To ensure appropriate reuse, however, we need to ask the question: “Are aggregated databases currently providing the right information to enable effective and unbiased reuse?” We investigate this question, focusing on designs that purposefully bias the selection of sampling locations (upweighting the probability of selection of some locations). Such designs are common; examples include designs with unequal inclusion probabilities and stratified designs. We perform a simulation experiment by creating datasets with progressively more bias, and examine the resulting statistical estimates. The effect of ignoring the survey design can be profound, with biases of up to 250% when naive analytical methods are used. The bias is not reduced by adding more data. Fortunately, the bias can be mitigated by using an appropriate estimator or an appropriate model. These are applicable, however, only when essential information about the survey design is available: the randomisation structure (e.g. inclusion probabilities or stratification), and/or the covariates used in the randomisation process. The results suggest that such information must be stored and served with the data to support inference and reuse.

Citation: S.D. Foster, J. Vanhatalo, V.M. Trenkel, T. Schulz, E. Lawrence, R. Przeslawski, and G.R. Hosack. 2021. Effects of ignoring survey design information for data reuse. Ecological Applications 31(6): e02360. 10.1002/eap.2360
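The kind of bias the description refers to can be illustrated with a small simulation. This is a minimal sketch and not the authors' experiment. The population model, the inclusion-probability formula, the use of Poisson sampling, and the choice of the Hájek (weighted-mean) estimator as the "appropriate estimator" are all assumptions made for illustration. The function name `simulate` and its parameters are hypothetical.

```python
import random


def simulate(n_sites=20000, bias_strength=4.0, seed=0):
    """Compare a naive mean with a design-weighted mean under biased sampling.

    All modelling choices here are illustrative assumptions, not taken
    from the dataset or the paper.
    """
    rng = random.Random(seed)

    # Hypothetical population: covariate x in [0, 1]; response y increases with x.
    x = [rng.random() for _ in range(n_sites)]
    y = [10.0 + 20.0 * xi + rng.gauss(0.0, 2.0) for xi in x]

    # Unequal inclusion probabilities: sites with large x are upweighted.
    pi = [0.02 * (1.0 + bias_strength * xi) for xi in x]

    # Poisson sampling: each site is included independently with probability pi.
    sampled = [i for i in range(n_sites) if rng.random() < pi[i]]

    true_mean = sum(y) / n_sites

    # Naive estimator: ignores the design and averages the sampled values.
    naive = sum(y[i] for i in sampled) / len(sampled)

    # Hajek estimator: weights each sampled value by 1 / inclusion probability.
    weights = [1.0 / pi[i] for i in sampled]
    hajek = sum(w * y[i] for w, i in zip(weights, sampled)) / sum(weights)

    return true_mean, naive, hajek


if __name__ == "__main__":
    true_mean, naive, hajek = simulate()
    print(f"true mean:  {true_mean:.2f}")
    print(f"naive mean: {naive:.2f}")
    print(f"Hajek mean: {hajek:.2f}")
```

In this sketch the naive mean overshoots because high-`x` sites are over-represented in the sample, and increasing `n_sites` does not remove that offset. The weighted estimate is only computable because `pi` is kept alongside the data, which is the point the description makes about storing design information.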
Provided by:
Australian Ocean Data Network