An Annotated Dataset of Stack Overflow Post Edits

arXiv · Updated 2020-05-06 · Indexed 2024-06-21
Download link:
https://doi.org/10.5281/zenodo.3754159
Description:
*An Annotated Dataset of Stack Overflow Post Edits* was created at the University of Adelaide. It contains over 7.45 million edit records for Stack Overflow posts, covering edits to both code and text, and is not restricted to specific programming languages. To build it, the researchers drew on the SOTorrent dataset and categorized the edit information through manual analysis and custom regular expressions. The dataset is intended primarily for software engineering research, in particular for mining fine-grained code and text patches that improve non-functional software properties such as runtime efficiency.
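
The regular-expression categorization mentioned above can be pictured with a minimal sketch. It assumes that the text being categorized is a free-text edit comment; the category names and patterns below are hypothetical illustrations, not the expressions the dataset's authors used.

```python
import re

# Hypothetical categories and patterns, for illustration only;
# these are NOT the regular expressions used to build the dataset.
CATEGORY_PATTERNS = {
    "performance": re.compile(
        r"\b(faster|speed|perform\w*|efficien\w*|optimi[sz]\w*)\b", re.IGNORECASE
    ),
    "bug_fix": re.compile(r"\b(fix\w*|bug|error|typo)\b", re.IGNORECASE),
    "formatting": re.compile(r"\b(format\w*|indent\w*|whitespace)\b", re.IGNORECASE),
}


def categorize_edit_comment(comment):
    """Return the sorted names of all categories whose pattern matches the comment."""
    return sorted(
        name
        for name, pattern in CATEGORY_PATTERNS.items()
        if pattern.search(comment)
    )
```

A comment can fall into several categories at once, which is why the function returns a list rather than a single label; comments that match no pattern yield an empty list and would be left to manual analysis.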
Provider:
University of Adelaide
Created:
2020-04-17