
CleanCodeReview

Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/record/13828618
Description:
To investigate the problem of classifying source code review comments, we created a dataset suitable for evaluating and testing methods that address it. We combined four open datasets and manually labeled 3,200 code review comments, using our own classification scheme derived from the available datasets. The final dataset contains 10,045 comments, divided into 16 classes and 5 groups for hierarchical classification.

The dataset contains the following comment classes:

Style - readability, code layout, indentation issues, and other common programming conventions
Naming - uniform style and convenience of naming variables, methods, and classes
Questioning - questions to the author of the code, requests for clarification of the code or for usage examples
Response - assigning other reviewers, acknowledgements, agreement with others, additions to the developer's opinion
Convention - discussion of the software development process
Testing - requests for tests that verify the functionality of the code
Design - architecture and code design, program structure control
Refactoring - logical structure, object creation, logical errors
Functionality - identification of code defects
Roadmap - further development of the program
Optimization - code optimization, parallelism, synchronization
Error - problems with exception and error handling
Documentation - problems with documentation or comments in the source code
Support - compatibility with other systems, support systems
Input/Output - input/output in the graphical user interface, problems with pop-up windows
Other - comments that carry no meaning without context

The classes are combined into groups as follows:

Code style (Style, Naming)
Discussion (Questioning, Response, Convention, Testing)
Development (Design, Refactoring, Functionality, Roadmap, Optimization, Error)
User (Documentation, Support, Input/Output)
Other (Other)
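
For hierarchical classification, the grouping of classes described above can be written as a lookup table. The following is a minimal Python sketch, not code shipped with the dataset: the names `GROUPS`, `CLASS_TO_GROUP`, and `to_group` are hypothetical, and the label strings are assumed to match the class names in the description (the actual label encoding in the files may differ).

```python
# Group -> classes, copied from the dataset description.
# Assumption: labels in the data files use these exact strings.
GROUPS = {
    "Code style": ["Style", "Naming"],
    "Discussion": ["Questioning", "Response", "Convention", "Testing"],
    "Development": [
        "Design", "Refactoring", "Functionality",
        "Roadmap", "Optimization", "Error",
    ],
    "User": ["Documentation", "Support", "Input/Output"],
    "Other": ["Other"],
}

# Inverted index: fine-grained class -> coarse group.
CLASS_TO_GROUP = {
    cls: group for group, classes in GROUPS.items() for cls in classes
}


def to_group(label: str) -> str:
    """Map a fine-grained class label to its group (hypothetical helper)."""
    return CLASS_TO_GROUP[label]


if __name__ == "__main__":
    print(len(GROUPS), len(CLASS_TO_GROUP))
    print(to_group("Naming"))
```

A classifier trained on the 16 classes could be scored at the group level by passing both predicted and true labels through `to_group` before computing metrics.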
Created:
2024-09-23