
Data sets of Characterising the Knowledge about Primitive Variables in Java Code Comments

Source: NIAID Data Ecosystem (indexed 2026-03-12)
Download link:
https://zenodo.org/record/4626387
Resource description:
We provide two data sets used in the paper “Characterising the Knowledge about Primitive Variables in Java Code Comments”. The first data set (Annotated_Evaluation dataset.txt) was used to answer RQ1-2, and the second data set (Large-scale dataset.csv) was used to answer RQ3-5.

A) Brief description for reading the Annotated_Evaluation dataset.txt file:

The column headers in the file are self-explanatory; for ease of reading, we briefly describe the following ones:
1) Project name: The project name as it appears on GitHub.
2) GitHub URL: GitHub URL of the variable's location in the source code.
3) Commit ID: The last commit.
4) Variable name: The variable identifier.
5) Access specifier: An instance variable (field variable) has one of the values default, private, public, or protected; a non-field variable has the value "[n/a]".
6) Keyword (Final): Depends on the scope of the variable; either “final” or “[n/a]”.
7) Keyword (Static): Either “static” for a field variable or “[n/a]” for local variables and parameters.
8) Variable type: One of int, byte, char, short, long, float, double, Boolean, and String.
9) Scope: The scope in which the variable is declared; one of instance variable, local variable (constructor), local variable (function), parameter (function), parameter (constructor).
10) Inline, line or block comments: The text of the comment accompanying the variable identifier, or "[n/a]" for an uncommented variable.
11) Comment type: Depends on the style of documentation: line {inline, line}, block, and block format.
12) Comment format: Has two values: Javadoc for comments of type block, and [n/a] for line and block format.
13) Columns P to U: Show the results of the four matching techniques. A "TRUE" value indicates that a particular technique can detect the identifier in the accompanying comment.
14) Columns V to AG: Annotators' results for each annotation question (AQ) illustrated in TABLE IV in the paper. Abbreviations used: 1) init: the initial annotation between a pair of annotators, 2) !NE: a conflict detected between a pair of annotators, and 3) CR: a conflict resolved between a pair of annotators.
15) Columns AJ to AQ: Data annotated by annotator 1.
16) Columns AR to AY: Data annotated by annotator 2.
17) Columns AZ to BG: Data annotated by annotator 3.
18) Columns AH to BO: Data annotated by annotator 4.

B) Large-scale dataset.csv file:

Please follow the explanations given in items 1 to 10 of (A) above.

To cite the dataset, please use the following paper: Mahfouth Alghamdi, Shinpei Hayashi, Takashi Kobayashi, and Christoph Treude, “Characterising the Knowledge about Primitive Variables in Java Code Comments,” in Proceedings of the 18th International Conference on Mining Software Repositories, 2021.
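
As a rough illustration of how the column descriptions above might be used, the following Python sketch counts commented variables per scope. It is not taken from the paper or the data set: the header strings ("Variable name", "Scope", "Inline, line or block comments") are assumed to match the names listed in the description, the function name commented_by_scope is hypothetical, and the sample rows are made up purely for demonstration.

```python
import csv
import io
from collections import Counter

NA = "[n/a]"  # marker the description uses for an uncommented variable

# Made-up rows for illustration only; the header names are ASSUMED from
# the description and may differ from the headers in the real file.
SAMPLE = '''Variable name,Scope,"Inline, line or block comments"
count,local variable (function),number of retries
id,instance variable,[n/a]
name,parameter (function),[n/a]
size,instance variable,buffer size in bytes
'''


def commented_by_scope(handle,
                       scope_col="Scope",
                       comment_col="Inline, line or block comments"):
    """Count, per scope, the variables whose comment cell is not "[n/a]"."""
    counts = Counter()
    for row in csv.DictReader(handle):
        if row[comment_col].strip() != NA:
            counts[row[scope_col]] += 1
    return counts


counts = commented_by_scope(io.StringIO(SAMPLE))
print(dict(counts))
```

To try the same idea on the real file, one could pass an opened file handle instead of the in-memory sample, for example open("Large-scale dataset.csv", newline="", encoding="utf-8"), after checking the actual header names and encoding, which are assumptions here.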
Created:
2021-03-22