davidgaofc/techdebt
Hugging Face. Updated 2023-12-04. Indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/davidgaofc/techdebt
Resource description:
---
license: mit
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
  - split: test
    path: data/test-*
  - split: validation
    path: data/validation-*
dataset_info:
  features:
  - name: Diff
    dtype: string
  - name: FaultInducingLabel
    dtype: int64
  splits:
  - name: train
    num_bytes: 89390701
    num_examples: 207464
  - name: test
    num_bytes: 29611000
    num_examples: 69155
  - name: validation
    num_bytes: 29496034
    num_examples: 69155
  download_size: 56932761
  dataset_size: 148497735
---
# Dataset Card for TechDebt
This dataset was derived from [The Technical Debt Dataset](https://github.com/clowee/The-Technical-Debt-Dataset) by Lenarduzzi et al.; the full citation is given under References.
## Dataset Details and Structure
Each example has two fields: `Diff` (string) and `FaultInducingLabel` (int64). The labels were produced by the SZZ algorithm, as cited in the paper, and matched to the diff of the commit in which the technical debt was located. Each diff was then cleaned to keep only the added lines of code.
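The card does not say how the diffs were cleaned. As a rough illustration only, the sketch below shows one way to keep just the added lines of a unified diff. The function name `added_lines` and the sample diff are hypothetical, and the unified-diff input format is an assumption, not something the dataset authors confirm.

```python
def added_lines(diff_text: str) -> str:
    """Keep only the lines a unified diff adds, without the leading '+'.

    Assumption: the input is a unified diff. This is an illustrative
    sketch, not the procedure used to build the dataset.
    """
    kept = []
    for line in diff_text.splitlines():
        # '+++ b/path' is a file header, not added code.
        # Caveat: an added line whose own content starts with '++'
        # would also be skipped by this simple check.
        if line.startswith("+++"):
            continue
        if line.startswith("+"):
            kept.append(line[1:])
    return "\n".join(kept)


# Made-up sample diff, not taken from the dataset.
example = """--- a/app.py
+++ b/app.py
@@ -1,2 +1,3 @@
 import os
-print("old")
+print("new")
+print(os.getcwd())
"""

print(added_lines(example))
```

Context lines and removed lines are dropped, so the result contains only what the commit introduced.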
## Bias, Risks, and Limitations
The data is imbalanced; account for this if you use the dataset. The queries used to extract the data are also still being reviewed for correctness.
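One common way to compensate for imbalance is to weight each class inversely to its frequency. The sketch below is a minimal example under assumptions: the helper `balanced_class_weights` is hypothetical, the toy labels are invented, and the weighting formula is the widely used "balanced" heuristic, not something this card prescribes.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count).

    Rare classes get larger weights, frequent classes smaller ones.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Toy labels invented for illustration; the real class ratio is not
# stated in the card and should be measured on FaultInducingLabel.
toy_labels = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
weights = balanced_class_weights(toy_labels)
print(weights)
```

The resulting weights can be passed to a loss function or classifier that accepts per-class weights. Measure the actual label counts on the training split before choosing a strategy.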
## Recommendations
This dataset is updated frequently in an effort to improve it. Keep this in mind when you use it.
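Because the data may change between downloads, it can help to pin a revision and compare split sizes against the numbers in the card metadata. The following is a sketch under assumptions: `load_techdebt` and `changed_splits` are hypothetical helper names, `"main"` is only a placeholder revision (replace it with a specific commit hash for reproducibility), and the expected counts are copied from the metadata above and may be outdated.

```python
# Row counts as listed in the card metadata; they may no longer match.
EXPECTED_ROWS = {"train": 207464, "test": 69155, "validation": 69155}


def load_techdebt(revision: str = "main"):
    """Load the dataset at a given Hub revision (branch, tag, or commit hash)."""
    # Imported lazily so this module can be imported without `datasets` installed.
    from datasets import load_dataset

    return load_dataset("davidgaofc/techdebt", revision=revision)


def changed_splits(dataset_dict, expected=EXPECTED_ROWS):
    """Return {split: (expected_rows, actual_rows)} for splits whose size differs."""
    return {
        name: (n, len(dataset_dict[name]))
        for name, n in expected.items()
        if len(dataset_dict[name]) != n
    }


# Usage (needs network access and the `datasets` package):
# ds = load_techdebt()
# print(changed_splits(ds))  # an empty dict means the sizes still match
```

A non-empty result indicates that the dataset has changed since the card metadata was written, so earlier results may not be directly comparable.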
## References
Valentina Lenarduzzi, Nyyti Saarimäki, Davide Taibi. The Technical Debt Dataset. Proceedings of the 15th Conference on Predictive Models and Data Analytics in Software Engineering. Brazil. 2019.
Provided by:
davidgaofc
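Summary of the original information
Dataset card for TechDebt
Dataset details and structure
Configuration
- Default configuration
  - Data files
    - Train: path data/train-*
    - Test: path data/test-*
    - Validation: path data/validation-*
Dataset information
- Features
  - Diff: type string
  - FaultInducingLabel: type int64
- Splits
  - Train
    - Bytes: 89390701
    - Examples: 207464
  - Test
    - Bytes: 29611000
    - Examples: 69155
  - Validation
    - Bytes: 29496034
    - Examples: 69155
- Download size: 56932761
- Dataset size: 148497735
Bias, risks, and limitations
Be aware of the data imbalance. The queries used to extract the data are still being checked to ensure correctness.
Recommendations
The dataset is continually being improved; keep this in mind when using it.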



