chathuranga-jayanath/context-5-predict-token-for-fine-tune-without-comments-from-finmath-0.1
Source: Hugging Face
Updated: 2024-01-29
Indexed: 2024-03-04
Download link:
https://hf-mirror.com/datasets/chathuranga-jayanath/context-5-predict-token-for-fine-tune-without-comments-from-finmath-0.1
Description:
---
dataset_info:
  features:
  - name: id
    dtype: int64
  - name: filepath
    dtype: string
  - name: start_bug_line
    dtype: int64
  - name: end_bug_line
    dtype: int64
  - name: bug
    dtype: string
  - name: fix
    dtype: string
  - name: ctx
    dtype: string
  splits:
  - name: train
    num_bytes: 15408906
    num_examples: 15574
  - name: validation
    num_bytes: 1921798
    num_examples: 1946
  - name: test
    num_bytes: 1911003
    num_examples: 1946
  download_size: 6808492
  dataset_size: 19241707
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
  - split: validation
    path: data/validation-*
  - split: test
    path: data/test-*
---
This dataset is designed for software bug localization and bug fixing tasks. Each record contains a unique identifier (id), a file path (filepath), the start and end lines of the bug (start_bug_line and end_bug_line), a bug description (bug), a fix description (fix), and context information (ctx). The dataset is divided into training, validation, and test sets containing 15574, 1946, and 1946 examples respectively. The download size is 6808492 bytes, and the dataset size is 19241707 bytes.
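The schema described above can be checked against a record before training on it. The following is a minimal sketch, not the author's tooling: `check_record` and `load_split` are hypothetical helper names, and the sample record in the usage note is invented for illustration. Loading assumes the `datasets` library is installed and the Hub is reachable.

```python
REPO_ID = (
    "chathuranga-jayanath/"
    "context-5-predict-token-for-fine-tune-without-comments-from-finmath-0.1"
)

# Column names and dtypes as declared in the dataset card.
EXPECTED_FEATURES = {
    "id": "int64",
    "filepath": "string",
    "start_bug_line": "int64",
    "end_bug_line": "int64",
    "bug": "string",
    "fix": "string",
    "ctx": "string",
}

# Python types that the card's dtypes map to once a row is read as a dict.
_PYTHON_TYPES = {"int64": int, "string": str}


def check_record(record: dict) -> list:
    """Return a list of schema problems; an empty list means the record matches."""
    problems = []
    for name, dtype in EXPECTED_FEATURES.items():
        if name not in record:
            problems.append(f"missing column: {name}")
        elif not isinstance(record[name], _PYTHON_TYPES[dtype]):
            actual = type(record[name]).__name__
            problems.append(f"{name}: expected {dtype}, got {actual}")
    return problems


def load_split(split: str = "train"):
    """Load one split from the Hub (hypothetical helper; needs network access)."""
    # Imported lazily so the schema check above works without the library.
    from datasets import load_dataset

    return load_dataset(REPO_ID, split=split)
```

A possible use, with a made-up record: `check_record({"id": 0, "filepath": "src/Example.java", "start_bug_line": 10, "end_bug_line": 10, "bug": "...", "fix": "...", "ctx": "..."})` returns an empty list, and `check_record(load_split("validation")[0])` would apply the same check to a real row.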
Provided by:
chathuranga-jayanath
Summary of original information
Dataset overview
Features
- id: int64
- filepath: string
- start_bug_line: int64
- end_bug_line: int64
- bug: string
- fix: string
- ctx: string
Splits
- train: 15408906 bytes, 15574 examples
- validation: 1921798 bytes, 1946 examples
- test: 1911003 bytes, 1946 examples
Sizes
- Download size: 6808492 bytes
- Dataset size: 19241707 bytes
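The split and size figures above are internally consistent, which a few lines of arithmetic can confirm. This sketch only uses the numbers listed on this page; the derived quantities (split fractions, average bytes per example, size ratio) are computed here and are not stated in the original card.

```python
# Figures copied from the listing above.
SPLITS = {
    "train": {"num_bytes": 15408906, "num_examples": 15574},
    "validation": {"num_bytes": 1921798, "num_examples": 1946},
    "test": {"num_bytes": 1911003, "num_examples": 1946},
}
DATASET_SIZE = 19241707
DOWNLOAD_SIZE = 6808492

total_bytes = sum(s["num_bytes"] for s in SPLITS.values())
total_examples = sum(s["num_examples"] for s in SPLITS.values())

# Share of examples in each split; works out to roughly 80/10/10.
fractions = {
    name: s["num_examples"] / total_examples for name, s in SPLITS.items()
}

# Average record size and the ratio of dataset size to download size.
avg_bytes_per_example = total_bytes / total_examples
size_ratio = DATASET_SIZE / DOWNLOAD_SIZE

print(f"total examples: {total_examples}")
print(f"split bytes sum to dataset_size: {total_bytes == DATASET_SIZE}")
for name, frac in fractions.items():
    print(f"{name}: {frac:.1%}")
print(f"average bytes per example: {avg_bytes_per_example:.0f}")
print(f"dataset size / download size: {size_ratio:.2f}")
```

The per-split byte counts add up exactly to the listed dataset size, so the dataset size is the sum of the three splits, while the smaller download size reflects the stored files.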
Configuration
- config_name: default
- data_files:
  - train: data/train-*
  - validation: data/validation-*
  - test: data/test-*
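The configuration maps each split to a glob pattern over files in the repository. As a rough illustration of how such patterns assign files to splits, the sketch below uses Python's `fnmatch`; this is an approximation for illustration, not the Hub's actual resolution logic, and `split_for` and the shard file names in the usage note are hypothetical.

```python
from fnmatch import fnmatch

# Split-to-pattern mapping copied from the default configuration above.
DATA_FILES = {
    "train": "data/train-*",
    "validation": "data/validation-*",
    "test": "data/test-*",
}


def split_for(path: str):
    """Return the split whose pattern matches `path`, or None if none does."""
    for split, pattern in DATA_FILES.items():
        if fnmatch(path, pattern):
            return split
    return None
```

For example, a hypothetical shard named `data/train-00000-of-00001.parquet` would be assigned to `train`, while a file such as `README.md` matches no pattern and is not part of any split.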



