An Annotated Dataset of Stack Overflow Post Edits
Source: arXiv. Updated: 2020-05-06. Indexed: 2024-06-21.
Download link:
https://doi.org/10.5281/zenodo.3754159
Description:
*An Annotated Dataset of Stack Overflow Post Edits* was created at the University of Adelaide. It contains more than 7.45 million edit records for Stack Overflow posts, covering edits to both code and text and not limited to any particular programming language. The researchers built it on the SOTorrent dataset and categorized the edits through manual analysis and custom regular expressions. The dataset is aimed at software engineering research, in particular at mining fine-grained code and text patches that optimize non-functional software properties such as runtime efficiency.
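As a rough illustration of the regex-based categorization mentioned in the description, the sketch below tags a free-text edit comment with zero or more categories. The category names, the patterns, and the assumption that the input is an edit comment string are all hypothetical; the dataset's actual categories and regular expressions are not listed on this page.

```python
import re
from typing import List

# Hypothetical categories and patterns, for illustration only. They are NOT
# the regular expressions used to build the dataset.
CATEGORY_PATTERNS = {
    "performance": re.compile(
        r"\b(faster|slow(er)?|performance|efficien\w*|optimi[sz]\w*)\b",
        re.IGNORECASE,
    ),
    "bug_fix": re.compile(r"\b(fix(ed|es)?|bug|error|typo)\b", re.IGNORECASE),
    "formatting": re.compile(
        r"\b(format(ting|ted)?|indent\w*|whitespace)\b", re.IGNORECASE
    ),
}


def categorize(comment: str) -> List[str]:
    """Return every category whose pattern matches the edit comment."""
    return [
        name
        for name, pattern in CATEGORY_PATTERNS.items()
        if pattern.search(comment)
    ]


if __name__ == "__main__":
    for text in ["Made the loop faster", "fixed typo and formatting", "added link"]:
        print(text, "->", categorize(text))
```

A comment can match several categories at once, which is why the function returns a list instead of a single label; comments that match nothing would be left for manual analysis.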
Provider:
University of Adelaide
Created:
2020-04-17



