200 Annotated Developer Human Errors from GitHub

Source: NIAID Data Ecosystem. Indexed: 2026-05-01
Download link:
https://zenodo.org/record/10080448
Description:
Software Engineers' Human Errors

This dataset contains 200 GitHub comments with manual human error annotations, released as part of the following publication:

Benjamin S. Meyers. Human Error Assessment in Software Engineering. Rochester Institute of Technology. 2023.

Included Files

The "developer_human_errors.csv" file contains the full dataset of 200 software defect descriptions annotated with human error types (slips, lapses, mistakes) and T.H.E.S.E. categories.

CSV Fields

ID: Unique identifier for the comment.
SOURCE: Whether this comment originates from a commit, issue, or pull request.
COMMENT_URL: The URL linking to the comment.
COMMENT_TEXT: The raw comment text.
HUMAN_ERROR_TYPE: Whether the software defect described is a slip, lapse, or mistake.
THESE_V4_ID: Manually assigned T.H.E.S.E. category with labels corresponding to Version 4 of T.H.E.S.E.
THESE_NAME: Name corresponding to manually assigned T.H.E.S.E. category.

Annotation Details

Human error types span slips, lapses, and mistakes from James Reason's Generic Error Modelling System (GEMS):

Slips: Failures of attention.
Lapses: Failures of memory.
Mistakes: Failures of planning.

T.H.E.S.E. categories are summarized below:

S01: Typos & Misspellings
S02: Syntax Errors
S03: Overlooking Documented Information
S04: Multitasking Errors
S05: Hardware Interaction Errors
S06: Overlooking Proposed Code Changes
S07: Overlooking Existing Functionality
S08: General Attentional Failure

L01: Forgetting to Finish a Development Task
L02: Forgetting to Fix a Defect
L03: Forgetting to Remove Development Artifacts
L04: Working with Outdated Source Code
L05: Forgetting an Import Statement
L06: Forgetting to Save Work
L07: Forgetting Previous Development Discussion
L08: General Memory Failure

M01: Code Logic Errors
M02: Incomplete Domain Knowledge
M03: Wrong Assumption Errors
M04: Internal Communication Errors
M05: External Communication Errors
M06: Solution Choice Errors
M07: Time Management Errors
M08: Inadequate Testing
M09: Incorrect/Insufficient Configuration
M10: Code Complexity Errors
M11: Internationalization/String Encoding Errors
M12: Inadequate Experience Errors
M13: Insufficient Tooling Access Errors
M14: Workflow Order Errors
M15: General Planning Failure

Contact

Please contact Benjamin S. Meyers (email) with questions about this data and its collection.

Acknowledgments

Collection of this data has been sponsored in part by the National Science Foundation (grant 1922169), by the NSA Science of Security Lablet program (grant H98230-17-D-0080/2018-0438-02), and by a Department of Defense DARPA SBIR program (grant 140D63-19-C-0018).
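As a rough illustration of how the CSV fields described above fit together, the sketch below tallies rows by HUMAN_ERROR_TYPE and flags rows whose T.H.E.S.E. ID prefix does not match that type. Several things here are assumptions rather than documented facts: that the column headers match the field names exactly, that HUMAN_ERROR_TYPE holds the singular words "slip", "lapse", or "mistake" (compared case-insensitively), and that the S/L/M prefix of THESE_V4_ID corresponds to the error type. The function name `summarize` is hypothetical, and the sample rows are invented placeholders, not records from the dataset.

```python
import csv
import io
from collections import Counter

# Assumption: the first letter of a T.H.E.S.E. ID encodes the human error type.
PREFIX_TO_TYPE = {"S": "slip", "L": "lapse", "M": "mistake"}


def summarize(csv_file):
    """Count rows per HUMAN_ERROR_TYPE and collect IDs whose category prefix disagrees."""
    counts = Counter()
    mismatches = []
    for row in csv.DictReader(csv_file):
        error_type = row["HUMAN_ERROR_TYPE"].strip().lower()
        counts[error_type] += 1
        # Look up the type implied by the category prefix (None if unrecognized).
        expected = PREFIX_TO_TYPE.get(row["THESE_V4_ID"].strip()[:1].upper())
        if expected != error_type:
            mismatches.append(row["ID"])
    return counts, mismatches


# Hypothetical placeholder rows for demonstration; not taken from the real dataset.
SAMPLE = """ID,SOURCE,COMMENT_URL,COMMENT_TEXT,HUMAN_ERROR_TYPE,THESE_V4_ID,THESE_NAME
1,commit,https://example.invalid/1,Fix typo in variable name,slip,S01,Typos & Misspellings
2,issue,https://example.invalid/2,Forgot to remove debug print,lapse,L03,Forgetting to Remove Development Artifacts
3,pull request,https://example.invalid/3,Off-by-one in loop bound,mistake,M01,Code Logic Errors
4,commit,https://example.invalid/4,Another misspelling,slip,S01,Typos & Misspellings
"""

counts, mismatches = summarize(io.StringIO(SAMPLE))
print(dict(counts))
print(mismatches)
```

To try this on the actual file, pass an open file handle instead of the in-memory sample, for example `open("developer_human_errors.csv", newline="", encoding="utf-8")`; the UTF-8 encoding is also an assumption and may need adjusting.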
Created: 2024-01-04