Software Engineering Teamwork Assessment in an Educational Setting Dataset
Payititi, indexed 2024-03-04
Download link:
https://www.payititi.com/opendatasets/show-26022.html
Resource description:
Prof. D. Petkovic (SFSU), Petkovic '@' sfsu.edu; Prof. Rainer Todtenhoefer (Fulda University, Germany); Prof. Shihong Huang (FAU)

Data Set Information:
The data can be used to try to predict student learning in SE teamwork based on observation of their team activity.

**** README FILE from the submitted data ZIP ****

# San Francisco State University
# Software Engineering Team Assessment and Prediction (SETAP) Project
# Machine Learning Training Data File Version 0.7
# ====================================================================
#
# Copyright 2000-2017 by San Francisco State University, Dragutin
# Petkovic, and Marc Sosnick-Perez.
#
# CONTACT
# -------
# Professor Dragutin Petkovic: petkovic '@' sfsu.edu
#
# LICENSE
# -------
# This data is released under the Creative Commons Attribution-
# NonCommercial 4.0 International license. For more information,
# please see
# [Web link].
#
# The research that has made this data possible has been funded in
# part by NSF grant NSF-TUES1140172.
#
# YOUR FEEDBACK IS WELCOME
# ------------------------
# We are interested in how this data is being used. If you use it in
# a research project, we would like to know how you are using the
# data. Please contact us at petkovic '@' sfsu.edu.
#
#
# FILES INCLUDED IN DISTRIBUTION PACKAGE
# ======================================
# This archive contains the data collected by the SETAP Project.
#
#
# More data about the SETAP project, data collection, and description
# and use of machine learning to analyze the data can be found in the
# following paper:
#
# D. Petkovic, M. Sosnick-Perez, K. Okada, R. Todtenhoefer, S. Huang,
# N. Miglani, A. Vigil: 'Using the Random Forest Classifier to Assess
# and Predict Student Learning of Software Engineering Teamwork'.
# Frontiers in Education FIE 2016, Erie, PA, 2016
#
#
#
# See DATA DESCRIPTION below for more information about the data.
# The README file (which you are reading) contains project information
# such as data collection techniques, data organization and field
# naming convention. In addition to the README file, the archive
# contains a number of .csv files. Each of these CSV files contains
# data aggregated by team from the project (see below), paired with
# that team's outcome for either the process or product component of
# the team's evaluation. The files are named using the following
# convention:
#
#     setap[Process|Product]T[1-11].csv
#
# For example, the file setapProcessT5.csv contains the data for all
# teams for time interval 5, paired with the outcome data for the
# Process component of the team's evaluation.
#
# Detailed information about the exact format of the .csv file may be
# found in the csv files themselves.
#
#
# DATA DESCRIPTION
# ====================================================================
# The following is a detailed description of the data contained in the
# accompanying files.
#
# INTRODUCTION
# ------------
#
# The data contained in these files were collected over a period of
# several semesters from students engaged in software engineering
# classes at San Francisco State University (class sections of CSC
# 640, CSC 648 and CSC 848). All students consented to this data
# being shared for research purposes provided no uniquely identifiable
# information was contained in the distributed files. The information
# was collected through various means, with emphasis being placed on
# the collection of objective, quantifiable information. For more
# information on the data collection procedures, please see the paper
# referenced above.
#
#
# PRIVACY
# -------
# The data contained in this file does not contain any information
# which may be individually traced to a particular student who
# participated in the study.
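
The file naming convention described in this README, setap[Process|Product]T[1-11].csv, can be sketched in a few lines of Python. This is only an illustration, not part of the SETAP distribution: the helper names `setap_filename` and `all_setap_filenames` are hypothetical, and the sketch assumes the interval range 1 to 11 stated in the convention, so it implies 22 file names without checking that every such file is actually present in the archive.

```python
from itertools import product
from typing import List

# Outcome components and time intervals named in the README's convention:
#   setap[Process|Product]T[1-11].csv
OUTCOMES = ("Process", "Product")
INTERVALS = range(1, 12)  # T1 .. T11 inclusive


def setap_filename(outcome: str, interval: int) -> str:
    """Build one file name following the README's naming convention."""
    if outcome not in OUTCOMES:
        raise ValueError(f"outcome must be one of {OUTCOMES}, got {outcome!r}")
    if interval not in INTERVALS:
        raise ValueError(f"interval must be in 1..11, got {interval!r}")
    return f"setap{outcome}T{interval}.csv"


def all_setap_filenames() -> List[str]:
    """List every file name the convention allows (2 outcomes x 11 intervals)."""
    return [setap_filename(o, t) for o, t in product(OUTCOMES, INTERVALS)]


if __name__ == "__main__":
    # The README's own example: Process outcome, time interval 5.
    print(setap_filename("Process", 5))
    print(len(all_setap_filenames()))
```

Keeping the Process and Product names in separate lists is convenient because the README asks that the two outcomes be trained on separately.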
#
#
# BRIEF DESCRIPTION OF DATA SOURCES AND DERIVATIONS
# -------------------------------------------------
# SAMs (Student Activity Measure) are collected for each student team
# member during their participation in a software engineering class.
# Student teams work together on a final class project, and comprise
# 5-6 students. Teams that are made up of students from only one
# school are labeled local teams. Teams made up of students from more
# than one school are labeled global teams. SAMs are collected from:
# weekly timecards, instructor observations, and software engineering
# tool usage logs. SAMs are then aggregated by team and time interval
# (see next section) into TAMs (Team Activity Measure). Outcomes are
# determined at the end of the semester through evaluation of student
# team work in two categories: software engineering process (how well
# the team applied best software engineering practices), and software
# engineering product (the quality of the finished product the team
# produced). Thus for each team, two outcomes are determined, process
# and product, respectively. Outcomes are classified into two class
# grades, A or F. A represents teams that are at or above
# expectations, F represents teams that are below expectations or need
# attention. For more information, please see the paper referenced
# above.
#
# The SE process and SE product outcomes represent ML training classes
# and are to be considered separately, e.g. one should train ML for SE
# process separately from training for SE product.
#
# TIME INTERVALS FOR WHICH DATA IS COLLECTED
# ------------------------------------------
# Data collected continuously throughout the semester are aggregated
# into different time intervals for the semester's project reflecting
# different dynamics of teamwork during the class. Time intervals
# represent time periods in which a milestone was developed by each
# team.
# A milestone represents a major deliverable point in the class
# for all student teams. The milestones are roughly divided into the
# following topics:
#
#     M1 - high level requirements and specs
#     M2 - more detailed requirements and specs
#     M3 - first prototype
#     M4 - beta release
#     M5 - final delivery
#
# Time intervals are combinations of the time in which milestones are
# being produced. Time intervals are used in research only.
#
# In addition to time intervals corresponding to milestones, a number
# of time intervals combining multiple T1-T5 time intervals have been
# calculated. This was done to group student activities into design
# vs. implementation phases which have different dynamics.
#
# These time intervals are defined as follows:
#
# Time Interval      Corresponding Milestone Periods in Class
# -----------------  --------------------------------------------
#         0          Milestone 0
#         1          Milestone 1
#         2          Milestone 2
#         3          Milestone 3
#         4          Milestone 4
#         5          Milestone 5
#         6          Milestone 1 - Milestone 2 inclusive
#         7          Milestone 1 - Milestone 3 inclusive
#         8          Milestone 1 - Milestone 4 inclusive
#         9          Milestone 1 - Milestone 5 inclusive
#        10          Milestone 4 - Milestone 5 inclusive
#        11          Milestone 3 - Milestone 5 inclusive
#
#
#
# SETAP PROJECT OVERALL DATA STATISTICS
# ===========================================================
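
Referring back to the time-interval table in the TIME INTERVALS section, the mapping from a time interval to the milestone periods it covers can be written as a small lookup. This is a minimal sketch for readers who want to relate a file's T-number to milestones; the names `TIME_INTERVALS` and `milestones_for` are hypothetical and do not come from the SETAP distribution. Interval 0 is included because the table lists it, even though the milestone list names only M1 to M5.

```python
from typing import Dict, List, Tuple

# Time interval -> (first milestone, last milestone), both inclusive,
# transcribed from the README's time-interval table.
TIME_INTERVALS: Dict[int, Tuple[int, int]] = {
    0: (0, 0),
    1: (1, 1),
    2: (2, 2),
    3: (3, 3),
    4: (4, 4),
    5: (5, 5),
    6: (1, 2),
    7: (1, 3),
    8: (1, 4),
    9: (1, 5),
    10: (4, 5),
    11: (3, 5),
}


def milestones_for(interval: int) -> List[int]:
    """Return the milestone numbers covered by a given time interval."""
    if interval not in TIME_INTERVALS:
        raise KeyError(f"unknown time interval: {interval!r}")
    first, last = TIME_INTERVALS[interval]
    return list(range(first, last + 1))


if __name__ == "__main__":
    # Interval 7 spans Milestone 1 through Milestone 3 inclusive.
    print(milestones_for(7))
```

Storing each interval as an inclusive (first, last) pair mirrors the wording of the table, where combined intervals are given as "Milestone X - Milestone Y inclusive".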
Provider:
Payititi



