Survey of software engineering in code used in published papers

NIAID Data Ecosystem, indexed 2026-03-12

Download link:

http://datadryad.org/dataset/doi%253A10.5061%252Fdryad.k6djh9w5h

Description:

Background: Computer code underpins modern science, and currently plays a crucial role in leading our response to the COVID-19 pandemic. While models are routinely criticised for their assumptions, the algorithms and the quality of the code implementing them often avoid scrutiny and, hence, scientific conclusions cannot be rigorously justified.

Problem: Assumptions in programs are hard to scrutinise as they are rarely explicit in published work. In addition, both algorithms and code have bugs, which are effectively unknown assumptions that have unwanted effects. Code is fallible. Any model interpretation that relies on code is therefore fallible, and if the code is not published with adequate documentation, the code cannot be scrutinised. In turn, the scientific claims cannot be properly scrutinised.

Solutions: Code can be made much more reliable using software engineering good practice. Three specific solutions are proposed. First, professional software engineers can help and should be involved in critical research. Secondly, “Software Engineering Boards” (supplementing and analogous to Ethics or Institutional Review Boards) must be instigated and used. Thirdly, code, when used, must be considered an intrinsic part of any publication, and therefore must be formally reviewed by competent software engineers. The paper’s Supplementary Material includes a summary of professional software engineering best practice, particularly as applied to scientific research and publication.

Methods: The websites of journals were accessed for their current (July 2020, double-checked January 2021) issues. Peer-reviewed research articles (i.e., excluding commentary, letters, etc.) were selected on the basis that their titles implied code had been used in the published research. A list of full citations, including DOIs, is provided in the supplementary material. The associated articles (PDFs) were read by the author, and all links to supplementary material were examined. Where possible, code from the links was downloaded and assessed. In addition, the code policies of the relevant journals, where available at the time of the survey, were copied into the supplementary material. The article’s supplementary material has further details of the assessment and the individual results, and is therefore an essential part of this Dryad data.

Note: It is important that this survey is not taken as an evaluation, whether praise or criticism, of any specific paper: any measurement is subject to error, and individual paper assessments are inevitably noisy. Instead, when a set of diverse papers is evaluated, the noise will tend to average out. Mean scores, as summarised in the paper, are more reliable measurements.

Created:

2021-02-08
