
Rediscovery Datasets: Connecting Duplicate Reports of Apache, Eclipse, and KDE

NIAID Data Ecosystem, indexed 2026-05-02
Download link:
https://zenodo.org/records/400614
Resource description:
We present three defect rediscovery datasets mined from Bugzilla. The datasets cover three groups of open source software projects: Apache, Eclipse, and KDE. They contain information about approximately 914 thousand defect reports over a period of 18 years (1999-2017) and capture the inter-relationships among duplicate defects.

File Descriptions

apache.csv - Apache Defect Rediscovery dataset
eclipse.csv - Eclipse Defect Rediscovery dataset
kde.csv - KDE Defect Rediscovery dataset

apache.relations.csv - Inter-relations of rediscovered defects of Apache
eclipse.relations.csv - Inter-relations of rediscovered defects of Eclipse
kde.relations.csv - Inter-relations of rediscovered defects of KDE

create_and_populate_neo4j_objects.cypher - Populates the Neo4j graph database by importing all the data from the CSV files. Note that you have to set the dbms.import.csv.legacy_quote_escaping configuration setting to false to load the CSV files, as per https://neo4j.com/docs/operations-manual/current/reference/configuration-settings/#config_dbms.import.csv.legacy_quote_escaping
create_and_populate_mysql_objects.sql - Populates the MySQL RDBMS by importing all the data from the CSV files
rediscovery_db_mysql.zip - For your convenience, we also provide a full backup of the MySQL database

neo4j_examples.txt - Sample Neo4j queries
mysql_examples.txt - Sample MySQL queries
rediscovery_eclipse_6325.png - Output of Neo4j example #1

distinct_attrs.csv - Distinct values of bug_status, resolution, priority, and severity for each project
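For readers who want to explore the duplicate relationships without setting up Neo4j or MySQL, the following is a minimal Python sketch of how pairs from a relations file might be grouped into sets of mutually linked defects. The column names `src_id` and `dst_id`, the helper names, and the inline sample rows are all hypothetical; the actual headers of the `*.relations.csv` files are not documented here and should be checked before use.

```python
import csv
import io
from collections import defaultdict


def read_pairs(handle, src_col, dst_col):
    """Yield (source, target) ID pairs from a CSV file handle with a header row."""
    for row in csv.DictReader(handle):
        yield row[src_col], row[dst_col]


def duplicate_groups(pairs):
    """Group defect IDs into connected components using union-find."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps lookups short
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    groups = defaultdict(list)
    for node in list(parent):
        groups[find(node)].append(node)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical sample data; real column names and IDs in the dataset may differ.
sample = io.StringIO("src_id,dst_id\n101,100\n102,100\n201,200\n")
groups = duplicate_groups(read_pairs(sample, "src_id", "dst_id"))
print(groups)
```

To try this on a real file, the `io.StringIO` sample could be replaced with `open("eclipse.relations.csv", newline="")` and the two column arguments adjusted to the file's actual header.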
Created:
2024-08-03