EngineeringSoftware/exLong-dataset
Source: Hugging Face. Updated 2024-11-07; indexed 2025-04-26.
Download link:
https://hf-mirror.com/datasets/EngineeringSoftware/exLong-dataset
---
license: mit
task_categories:
- text-generation
tags:
- code
size_categories:
- 1K<n<10K
configs:
- config_name: with-EBT-name
  data_files:
  - split: train
    path: "with-EBT-name/train-conditionnestack2e-with-name-ft.jsonl"
  - split: test
    path: "with-EBT-name/test-conditionnestack2e-with-name-ft.jsonl"
- config_name: no-EBT-name
  data_files:
  - split: train
    path: "no-EBT-name/train-conditionnestack2e-no-name-ft.jsonl"
  - split: test
    path: "no-EBT-name/test-conditionnestack2e-no-name-ft.jsonl"
---
# exLong Dataset
This dataset is used to train and evaluate the exLong models, which generate exceptional-behavior tests (EBTs).
It has **two** subsets:
- 'with-EBT-name': provides the target test name in the prompt
- 'no-EBT-name': does not provide the target test name in the prompt
**NOTE**: the data format is customized for Code Llama models.
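
The two subsets map onto the `configs` entries in the YAML header, so a subset and split can be selected by name. The following is a minimal sketch, assuming the Hugging Face `datasets` library is installed and the repository id matches the page title; the helper names `data_file_path` and `load_exlong` are illustrative and not part of the dataset.

```python
REPO_ID = "EngineeringSoftware/exLong-dataset"  # taken from the page title
CONFIGS = ("with-EBT-name", "no-EBT-name")
SPLITS = ("train", "test")


def data_file_path(config_name, split):
    """Rebuild the JSONL path listed for a config/split in the YAML header."""
    if config_name not in CONFIGS:
        raise ValueError(f"unknown config: {config_name!r}")
    if split not in SPLITS:
        raise ValueError(f"unknown split: {split!r}")
    suffix = "with-name" if config_name == "with-EBT-name" else "no-name"
    return f"{config_name}/{split}-conditionnestack2e-{suffix}-ft.jsonl"


def load_exlong(config_name, split="train"):
    """Load one subset and split from the Hub (hypothetical helper).

    Requires network access and the third-party `datasets` package.
    """
    data_file_path(config_name, split)  # validates the arguments only
    from datasets import load_dataset  # imported lazily: optional dependency

    return load_dataset(REPO_ID, config_name, split=split)
```

For example, `load_exlong("no-EBT-name", "test")` would request the test split whose prompts omit the target test name, while `data_file_path` can be used on its own to locate the raw JSONL file inside the repository.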
## Language
This is a Java dataset.
## Dataset Structure
The input for the model contains the following context:
1. Method under test (MUT)
2. Relevant non-exceptional behavior test (nEBT)
3. Guard expression: a logical formula over the symbolic variables that must hold for execution to follow the trace leading to the target throw statement
4. Stack trace: the sequence of method invocations that starts from the MUT and leads to the target throw statement
5. (Optional) Exceptional-behavior test name
## Data Fields
1. id: the identifier for each data example
2. instruction: prompt containing all the context
3. output: target exceptional-behavior test
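
Since the data files are JSONL, each line should decode to one object carrying the three fields above. The sketch below reads such lines with the standard library and checks that the fields are present. The sample record is hypothetical: its values, and the assumption that `id` is a string, are for illustration only and are not taken from the dataset.

```python
import json

REQUIRED_FIELDS = ("id", "instruction", "output")


def parse_jsonl(text):
    """Parse JSONL text into a list of records, checking the required fields."""
    records = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # tolerate blank lines
        record = json.loads(line)
        missing = [name for name in REQUIRED_FIELDS if name not in record]
        if missing:
            raise ValueError(f"line {line_number}: missing fields {missing}")
        records.append(record)
    return records


# Hypothetical record, invented for illustration; real prompts follow the
# Code Llama-specific format mentioned in the note above.
SAMPLE_LINE = json.dumps({
    "id": "example-0",
    "instruction": "<prompt containing MUT, nEBT, guard expression, stack trace>",
    "output": "@Test(expected = IllegalArgumentException.class)\n"
              "public void testNegativeSize() { new Buffer(-1); }",
})

records = parse_jsonl(SAMPLE_LINE)
```

Checking fields up front makes a malformed line fail with its line number rather than surfacing later during fine-tuning or evaluation.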
Provider:
EngineeringSoftware



