
Data availability, reusability, and analytic reproducibility: Evaluating the impact of a mandatory open data policy at the journal Cognition

Source: osf.io | Updated: 2023-08-09 | Indexed: 2025-01-15
Download link:
https://osf.io/wn8fd
Description:
Access to data is a critical feature of an efficient, progressive and ultimately self-correcting scientific ecosystem. But the extent to which in-principle benefits of data sharing are realized in practice is unclear. Crucially, it is largely unknown whether published findings can be reproduced by repeating reported analyses upon shared data (‘analytic reproducibility’). To investigate this, we conducted an observational evaluation of a mandatory open data policy introduced at the journal Cognition. Interrupted time-series analyses indicated a substantial post-policy increase in data available statements (104/417, 25% pre-policy to 136/174, 78% post-policy), although not all data appeared reusable (23/104, 22% pre-policy to 85/136, 62% post-policy). For 35 of the articles determined to have reusable data, we attempted to reproduce 1324 target values. Ultimately, 64 values could not be reproduced within a 10% margin of error. For 22 articles all target values were reproduced, but 11 of these required author assistance. For 13 articles at least one value could not be reproduced despite author assistance. Importantly, there were no clear indications that original conclusions were seriously impacted. Mandatory open data policies can increase the frequency and quality of data sharing. However, suboptimal data curation, unclear analysis specification and reporting errors can impede analytic reproducibility, undermining the utility of data sharing and the credibility of scientific findings.
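
The counts quoted in the description can be recomputed directly, and the 10% reproducibility margin can be illustrated with a small check. The following is a minimal sketch, not the authors' analysis code: the names `REPORTED`, `percent` and `within_margin` are hypothetical, and the margin is assumed here to mean relative error against the reported value, which the description does not spell out.

```python
# Counts as quoted in the description: (numerator, denominator).
REPORTED = {
    "data available statement, pre-policy": (104, 417),
    "data available statement, post-policy": (136, 174),
    "reusable data, pre-policy": (23, 104),
    "reusable data, post-policy": (85, 136),
}


def percent(numerator: int, denominator: int) -> float:
    """Return numerator/denominator expressed as a percentage."""
    return 100 * numerator / denominator


def within_margin(reported: float, reproduced: float, margin: float = 0.10) -> bool:
    """Assumed rule: relative error against the reported value is at most `margin`."""
    if reported == 0:
        # Relative error is undefined for a reported zero; require an exact match.
        return reproduced == 0
    return abs(reproduced - reported) / abs(reported) <= margin


# Recompute each proportion from its raw counts.
for label, (k, n) in REPORTED.items():
    print(f"{label}: {k}/{n} = {percent(k, n):.1f}%")

# 1324 target values were attempted and 64 could not be reproduced.
reproduced_share = percent(1324 - 64, 1324)
print(f"target values reproduced: {reproduced_share:.1f}%")
```

Under this assumed definition, a reported value of 2.50 reproduced as 2.60 would count as reproduced (4% relative error), while 2.50 reproduced as 3.00 would not (20%).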

Provider:
Center For Open Science