
SelfCode 2.0: Annotated Corpus of Student Self-Explanations to Introductory JAVA Programs in Computer Science

NIAID Data Ecosystem, indexed 2026-05-02
Download link:
https://zenodo.org/record/10912668
Resource description:
Dataset Description: This dataset was collected during a lab study conducted in Spring 2022 for introductory JAVA programming. In the experimental condition of the study, students had to provide line-by-line explanations for four JAVA programs. The JAVA programs were selected from the examples made available in the PCEX Worked Examples interface. The explanations collected were then split by the number of attempts. Students could attempt twice based on the feedback provided through the PCEX interface, and in their third attempt they filled in the blanks to complete an explanation of the particular line of code. This dataset contains only the annotated examples of explanations provided by students.

The explanations were annotated on their correctness (binary rating 0 or 1), completeness (binary rating 0 or 1) and similarity (rating scale 1 to 5).

Correctness: Given the line of code and the context of the line in the program, whether the student explanation covers **only** the topics relevant to the line of code.

Completeness: Given the line of code and the context of the line in the program, whether the student explanation covers **all** the topics relevant to the line of code.

Similarity: Given the line of code, the context of the line in the program and an expert explanation of the line of code, the metric compares the student and expert explanations on a rating scale from 1 to 5, defined as follows:
1 - expert and student explanations are very different
2 - expert and student explanations are somewhat alike, but there are major differences in the concepts / topics explained
3 - expert and student explanations are similar, but there are differences in the concepts / topics explained
4 - expert and student explanations are similar and have few differences in the concepts / topics explained
5 - expert and student explanations are very similar

Overall, 3000 single attempts (corresponding to 40 student explanation submissions) were annotated against various expert explanation pairs.
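The three rating scales above can be expressed as a small record type with range checks. This is only a sketch for readers loading the data: the class name `Annotation`, its field names and the helper `rating_errors` are hypothetical and are not taken from the dataset files, whose actual column names may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """One annotated student explanation (field names are hypothetical)."""
    correctness: int   # binary: 0 or 1
    completeness: int  # binary: 0 or 1
    similarity: int    # ordinal: 1 (very different) to 5 (very similar)


def rating_errors(a: Annotation) -> list[str]:
    """Return a list of messages for ratings outside the documented scales."""
    errors = []
    if a.correctness not in (0, 1):
        errors.append("correctness must be 0 or 1")
    if a.completeness not in (0, 1):
        errors.append("completeness must be 0 or 1")
    if a.similarity not in range(1, 6):
        errors.append("similarity must be an integer from 1 to 5")
    return errors


# Made-up ratings for illustration; these are not rows from the dataset.
ok = Annotation(correctness=1, completeness=0, similarity=3)
bad = Annotation(correctness=1, completeness=2, similarity=6)
print(rating_errors(ok))
print(rating_errors(bad))
```

A check of this kind may be useful before computing any statistics, since a rating outside the documented range would otherwise go unnoticed in the results.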
Dataset Summary:

| Explanation Type | N | Definition |
|---|---|---|
| Experts | 2 | Source Code Line-by-Line Explanations by Experts |
| Students | 60 (annotated 40) | Source Code Line-by-Line Explanations by Students |

COUNT of std_sent_count (columns: std_sent_count):

|  | 1 | 2 | 3 | 4 | 5 | 6 | Grand Total |
|---|---|---|---|---|---|---|---|
| 1 | 1854 | 367 | 245 | 107 | 34 | 33 | 2640 |
| 2 | 222 | 46 | 40 | 12 | 6 | 6 | 332 |
| 3 | 21 | 5 | 5 | 5 | 1 | 2 | 39 |
| 4 | 2 | 1 | 2 | 3 |  |  | 8 |
| Grand Total | 2099 | 419 | 292 | 127 | 41 | 41 | 3019 |

Sample Data:

Program: PointTester; Line number: 12; Line code: private int y;

Expert1: Every object of the Point class will have its own y-coordinate. Therefore, we need to declare an instance variable for the class to store the y-coordinate of the point. We declare it as int because we want to have integer coordinates for the point. Note that an instance variable is a variable defined in a class, for which each instantiated object of the class has a separate copy, or instance.

Expert2: The instance variables are declared as private to prevent direct access to them from outside the class. In this way, no unexpected modifications to a Point object’s data are possible.

Student1: initialize a private value inside the point class with no value yet
Student2: Declares the private int variable y.
Student3: Creates a private int that can only be accessed by class Point called int y
...
Student59: private variable used to store the value entered into the value of the y coordinate

Kappa Scores:

| Round | Row Numbers | Correctness Rating Agreement %age | Correctness Rating Kappa | Sufficiency Rating Agreement %age | Sufficiency Rating Kappa |
|---|---|---|---|---|---|
| 1 | 1000-1432 | 92.9 | 0.365 | 0.708 | -0.0123 |
| 2 | 1432-1864 | 94.2 | 0.263 | 77.6 | 0.329 |
| 3 | 1864-1964 | 75.3 | 0 | 70.3 | 0.299 |
| 4 | 1964-2064 | 86 | 0.108 | 74.7 | 0.275 |
| 5 | 2064-2264 | 95.5 | -0.0158 | 81.5 | 0.312 |
| 6 | 2264-2464 | 83.5 | 0.039 | 86.5 | 0.648 |
| 7 | 2464-2864 | 92 | 0.103 | 74.5 | 0.188 |
| 8 | 2864-3005 | 86.5 | -0.026 | 72.3 | 0.117 |

Citation Format: If using this dataset in your project, please cite:

Lekshmi-Narayanan, A.-B., Chapagain, J., Brusilovsky, P., & Rus, V. (2023). SelfCode 2.0: Annotated Corpus of Student Self-Explanations to Introductory JAVA Programs in Computer Science [Data set]. Zenodo. https://doi.org/10.5281/zenodo.10912669

Acknowledgements: This project was funded as a part of NSF Award #1822752.
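The Kappa Scores reported for this dataset pair a percent-agreement figure with a kappa value for each annotation round. The source does not say which kappa variant or tool was used, so the following is a minimal sketch that assumes Cohen's kappa for two annotators, computed as (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance. The function names and the toy ratings are illustrative assumptions, not dataset values.

```python
from collections import Counter


def percent_agreement(a, b):
    """Share of items (in percent) on which two annotators give the same label."""
    if len(a) != len(b) or not a:
        raise ValueError("rating lists must be non-empty and of equal length")
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


def cohen_kappa(a, b):
    """Cohen's kappa for two annotators: (p_o - p_e) / (1 - p_e)."""
    if len(a) != len(b) or not a:
        raise ValueError("rating lists must be non-empty and of equal length")
    n = len(a)
    p_o = sum(x == y for x, y in zip(a, b)) / n
    counts_a, counts_b = Counter(a), Counter(b)
    # Chance agreement: product of each annotator's marginal label frequencies.
    labels = set(counts_a) | set(counts_b)
    p_e = sum(counts_a[k] * counts_b[k] for k in labels) / (n * n)
    if p_e == 1.0:
        # Both annotators used one identical label throughout; kappa is undefined.
        return float("nan")
    return (p_o - p_e) / (1.0 - p_e)


# Toy binary ratings (made up for illustration, not taken from the dataset).
annotator_1 = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
annotator_2 = [1, 1, 1, 1, 1, 1, 1, 0, 0, 1]
print(percent_agreement(annotator_1, annotator_2))
print(cohen_kappa(annotator_1, annotator_2))
```

In this toy case the annotators agree on 8 of 10 items, yet kappa is 0 because one annotator assigned the same label to every item, so the observed agreement equals the chance agreement. Under the Cohen's kappa assumption, a similar effect may explain rounds that combine high agreement percentages with kappa values near zero.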
Created:
2025-03-31