qfq/eidata_dagger_20241027_010023_iter2
Source: Hugging Face. Updated: 2024-10-30. Indexed: 2025-04-12.
Download link:
https://hf-mirror.com/datasets/qfq/eidata_dagger_20241027_010023_iter2
Resource description:
---
dataset_info:
  features:
  - name: doc_id
    dtype: int64
  - name: doc
    struct:
    - name: orig_problem
      dtype: string
    - name: orig_solution
      dtype: string
    - name: orig_answer
      dtype: string
    - name: thinking_trajectory
      sequence: string
    - name: golden_thinking_trajectory
      sequence: string
    - name: old_trajectory
      sequence: string
    - name: labeled_trajectory
      sequence: string
    - name: problem
      dtype: string
    - name: solution
      dtype: string
    - name: answer
      dtype: string
  - name: target
    dtype: string
  - name: arguments
    struct:
    - name: gen_args_0
      struct:
      - name: arg_0
        dtype: string
      - name: arg_1
        struct:
        - name: until
          sequence: string
        - name: do_sample
          dtype: bool
        - name: temperature
          dtype: float64
        - name: max_gen_toks
          dtype: int64
  - name: resps
    sequence:
      sequence: string
  - name: filtered_resps
    sequence: string
  - name: doc_hash
    dtype: string
  - name: prompt_hash
    dtype: string
  - name: target_hash
    dtype: string
  - name: exact_match
    dtype: int64
  - name: orig_problem
    dtype: string
  - name: orig_solution
    dtype: string
  - name: orig_answer
    dtype: string
  - name: thinking_trajectory
    sequence: string
  - name: golden_thinking_trajectory
    sequence: string
  - name: old_trajectory
    sequence: string
  - name: labeled_trajectory
    sequence: string
  - name: problem
    dtype: string
  - name: solution
    dtype: string
  - name: answer
    dtype: string
  - name: text
    dtype: string
  splits:
  - name: train
    num_bytes: 116387990
    num_examples: 8032
  - name: test
    num_bytes: 6267220
    num_examples: 423
  download_size: 62630538
  dataset_size: 122655210
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
  - split: test
    path: data/test-*
---
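As a rough illustration of how one row of this schema might look in Python, the sketch below builds a placeholder record and reads two nested fields from it. It assumes that `doc` holds the problem fields and that `arguments.gen_args_0.arg_1` holds the generation settings. Every value is invented for illustration, and the helper names `generation_settings` and `first_response` are hypothetical, not part of the dataset or any library.

```python
from typing import Any, Dict, Optional

# Hypothetical record shaped like the schema above.
# All values are placeholders, not real rows from the dataset.
example_record: Dict[str, Any] = {
    "doc_id": 0,
    "doc": {
        "orig_problem": "placeholder problem",
        "orig_solution": "placeholder solution",
        "orig_answer": "placeholder answer",
        "thinking_trajectory": ["placeholder step"],
        "golden_thinking_trajectory": ["placeholder step"],
        "old_trajectory": [],
        "labeled_trajectory": [],
        "problem": "placeholder problem",
        "solution": "placeholder solution",
        "answer": "placeholder answer",
    },
    "target": "placeholder answer",
    "arguments": {
        "gen_args_0": {
            "arg_0": "placeholder prompt",
            "arg_1": {
                "until": [],
                "do_sample": False,
                "temperature": 0.0,
                "max_gen_toks": 1024,
            },
        }
    },
    "resps": [["placeholder response"]],      # sequence of string sequences
    "filtered_resps": ["placeholder response"],
    "exact_match": 1,
    "text": "placeholder text",
}


def generation_settings(record: Dict[str, Any]) -> Dict[str, Any]:
    """Return the nested generation settings (until, do_sample, ...)."""
    return record["arguments"]["gen_args_0"]["arg_1"]


def first_response(record: Dict[str, Any]) -> Optional[str]:
    """Return the first raw response, or None if no response is stored."""
    resps = record.get("resps") or []
    if resps and resps[0]:
        return resps[0][0]
    return None
```

The hash fields and the top-level copies of the `doc` fields are left out of the placeholder to keep it short; a real row would carry them as well.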
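Dataset information:
Features:
- Field: doc_id (document ID), type: 64-bit integer
- Field: doc (document), a struct containing:
  - Field: orig_problem (original problem), type: string
  - Field: orig_solution (original solution), type: string
  - Field: orig_answer (original answer), type: string
  - Field: thinking_trajectory (thinking trajectory), type: sequence of strings
  - Field: golden_thinking_trajectory (gold-standard thinking trajectory), type: sequence of strings
  - Field: old_trajectory (old trajectory), type: sequence of strings
  - Field: labeled_trajectory (labeled trajectory), type: sequence of strings
  - Field: problem (problem), type: string
  - Field: solution (solution), type: string
  - Field: answer (answer), type: string
- Field: target (target), type: string
- Field: arguments (argument set), a struct containing:
  - Field: gen_args_0 (generation arguments 0), a struct containing:
    - Field: arg_0 (argument 0), type: string
    - Field: arg_1 (argument 1), a struct containing:
      - Field: until (stop sequences), type: sequence of strings
      - Field: do_sample (sampling switch), type: boolean
      - Field: temperature (sampling temperature), type: double-precision float
      - Field: max_gen_toks (maximum number of generated tokens), type: 64-bit integer
- Field: resps (responses), type: sequence of string sequences
- Field: filtered_resps (filtered responses), type: sequence of strings
- Field: doc_hash (document hash), type: string
- Field: prompt_hash (prompt hash), type: string
- Field: target_hash (target hash), type: string
- Field: exact_match (exact-match metric), type: 64-bit integer
- Field: orig_problem (original problem), type: string
- Field: orig_solution (original solution), type: string
- Field: orig_answer (original answer), type: string
- Field: thinking_trajectory (thinking trajectory), type: sequence of strings
- Field: golden_thinking_trajectory (gold-standard thinking trajectory), type: sequence of strings
- Field: old_trajectory (old trajectory), type: sequence of strings
- Field: labeled_trajectory (labeled trajectory), type: sequence of strings
- Field: problem (problem), type: string
- Field: solution (solution), type: string
- Field: answer (answer), type: string
- Field: text (text), type: string
Splits:
- Split: train (training set), size in bytes: 116387990, number of examples: 8032
- Split: test (test set), size in bytes: 6267220, number of examples: 423
Total download size: 62630538 bytes
Total dataset size: 122655210 bytes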
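The split figures can be cross-checked with a few lines of arithmetic. The sketch below only restates the numbers listed above and derives the totals, the share of examples in the test split, and the ratio of download size to dataset size. The variable names are chosen for illustration.

```python
# Figures copied from the split listing above.
splits = {
    "train": {"num_bytes": 116_387_990, "num_examples": 8_032},
    "test": {"num_bytes": 6_267_220, "num_examples": 423},
}
download_size = 62_630_538
dataset_size = 122_655_210

# The per-split byte counts should add up to the reported dataset size.
total_bytes = sum(s["num_bytes"] for s in splits.values())
total_examples = sum(s["num_examples"] for s in splits.values())

# Share of examples held out for testing (about 5%).
test_fraction = splits["test"]["num_examples"] / total_examples

# Download size relative to dataset size (about 51%).
download_ratio = download_size / dataset_size

print(f"total bytes match dataset_size: {total_bytes == dataset_size}")
print(f"total examples: {total_examples}")
print(f"test fraction: {test_fraction:.3f}")
print(f"download / dataset size: {download_ratio:.3f}")
```

The two byte counts sum exactly to the reported dataset size, and the two splits together hold 8455 examples.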
Configs:
- Config name: default (default configuration), data file paths:
  - train: data/train-*
  - test: data/test-*
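The glob patterns in the default config decide which files belong to which split. The sketch below mimics that mapping with the standard-library `fnmatch` module and adds a thin wrapper around `datasets.load_dataset` for fetching a split. It assumes the repository is still reachable under the identifier in the download link and that the `datasets` library is installed. The shard file names in the usage note are hypothetical, and `split_for_path` and `load_split` are illustrative helpers.

```python
from fnmatch import fnmatch
from typing import Optional

REPO_ID = "qfq/eidata_dagger_20241027_010023_iter2"

# Split name -> glob pattern, as listed in the default config.
DATA_FILES = {"train": "data/train-*", "test": "data/test-*"}


def split_for_path(path: str) -> Optional[str]:
    """Return the split whose pattern matches the file path, if any."""
    for split, pattern in DATA_FILES.items():
        if fnmatch(path, pattern):
            return split
    return None


def load_split(split: str = "train"):
    """Load one split from the Hub (requires network access)."""
    # Imported lazily so split_for_path works without the library installed.
    from datasets import load_dataset

    return load_dataset(REPO_ID, split=split)
```

With a hypothetical shard name such as `data/train-00000-of-00001.parquet`, `split_for_path` returns `"train"`, while a path outside `data/` matches no split. Calling `load_split("test")` would download and return the test split, provided the repository is available.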
Provider:
qfq



