CCD
arXiv, indexed 2025-09-30
Download link:
https://github.com/cogito2012/carcrashdataset
Description:
This dataset comprises GitHub issues labeled as machine learning (ML) problems, drawn from projects that use machine learning libraries. It includes only issues whose titles contain ML-related keywords and whose fixes were made in scripts that use machine learning libraries. The issues were filtered by ML keyword and then manually curated, and the dataset includes a comparative analysis against non-ML issues. The source projects were selected according to criteria including novelty, popularity, and activity level. The intended task is a comparison of resolution time and fix size between ML and non-ML issues.
Provided by:
GitHub



