
DACOS - Dataset

Source: NIAID Data Ecosystem (indexed 2026-03-14)
Download link:
https://zenodo.org/record/7569671
Resource description:
DACOS - DAtaset of COde Smells

The dataset offers annotated code snippets for three code smells: multifaceted abstraction, complex method, and long parameter list. In addition to a manually annotated dataset of potentially subjective snippets, we offer a larger set of snippets that are either definitely benign or definitely smelly.

The upload contains three files:

DACOSMain.sql - The SQL file containing the main DACOS dataset.
DACOSExtended.sql - The SQL file containing the extended DACOS dataset.
Files.zip - The zip file containing all the source code files.

Required Software

The dataset was created in MySQL. A local or remote MySQL installation is therefore needed, with privileges to create and modify schemas.

Importing the Dataset

Each dataset is a self-contained SQL file. To import the datasets, run the following commands:

    mysql -u username -p database_name < DACOSMain.sql
    mysql -u username -p database_name < DACOSExtended.sql

Understanding the Datasets

The two datasets differ in structure. The main dataset contains a table named annotations that holds every annotation collected from users. The sample table contains the samples presented to users for annotation. The class_metrics and method_metrics tables contain class and method metrics, respectively. These metrics were used to filter samples that are likely to contain smells and hence can be shown to users.

The extended dataset was created by selecting samples that fall below or above the selected metric range for each smell. Hence, these samples are definitely smelly or definitely benign. The extended dataset does not contain an annotation table, because its samples were not presented to users. Instead, it has an 'entry' table in which each sample is classified according to the smell it contains.
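The import commands above could also be scripted. The following is a minimal Python sketch, not part of the dataset itself: the function names `build_import_command` and `import_dumps` are hypothetical, and the user and database names are placeholders, as in the original commands. It assumes the `mysql` client is on the PATH and that the target database already exists.

```python
import subprocess
from pathlib import Path

# File names come from the dataset description above.
DUMP_FILES = ["DACOSMain.sql", "DACOSExtended.sql"]


def build_import_command(user: str, database: str) -> list[str]:
    """Return the argv equivalent of `mysql -u <user> -p <database>`."""
    return ["mysql", "-u", user, "-p", database]


def import_dumps(user: str, database: str, directory: str = ".") -> None:
    """Feed each dump to the mysql client on stdin, like `< file.sql`."""
    for name in DUMP_FILES:
        path = Path(directory) / name
        with path.open("rb") as dump:
            # check=True raises CalledProcessError if mysql exits non-zero.
            subprocess.run(
                build_import_command(user, database),
                stdin=dump,
                check=True,
            )
```

Calling `import_dumps("username", "database_name")` from the directory that holds the two SQL files would run both imports in order, with a password prompt for each.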
The codes for identifying smells are as follows:

Condition                                Smell ID
Multifaceted Abstraction Present         1
Multifaceted Abstraction not detected    4
Long Parameter List Present              2
Long Parameter List Absent               5
Complex Method Present                   3
Complex Method Absent                    6
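When reading rows from the extended dataset, it may help to translate these numeric codes back into a smell name and a present/absent flag. The sketch below is one possible lookup in Python. The names `SmellCode`, `SMELL_CODES`, and `decode_smell_id` are hypothetical helpers, not part of the dataset, and only the IDs themselves come from the list above.

```python
from typing import NamedTuple


class SmellCode(NamedTuple):
    smell: str
    present: bool


# IDs copied from the code list above; everything else is illustrative.
SMELL_CODES = {
    1: SmellCode("Multifaceted Abstraction", True),
    2: SmellCode("Long Parameter List", True),
    3: SmellCode("Complex Method", True),
    4: SmellCode("Multifaceted Abstraction", False),
    5: SmellCode("Long Parameter List", False),
    6: SmellCode("Complex Method", False),
}


def decode_smell_id(smell_id: int) -> SmellCode:
    """Look up a smell ID; raise ValueError for IDs outside the list."""
    try:
        return SMELL_CODES[smell_id]
    except KeyError:
        raise ValueError(f"unknown smell ID: {smell_id}") from None
```

In the listed codes, the "absent" ID for each smell is the "present" ID plus 3, so `decode_smell_id(2)` and `decode_smell_id(5)` both refer to Long Parameter List, with opposite flags.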
Created:
2023-01-26