haoranli-ml/genvf-sdpo-filtered_0.6
Source: Hugging Face. Updated 2026-04-01; indexed 2026-04-12.
Download link:
https://hf-mirror.com/datasets/haoranli-ml/genvf-sdpo-filtered_0.6
Resource description:
---
dataset_info:
  features:
  - name: index
    dtype: int64
  - name: row_id
    dtype: int64
  - name: problem
    dtype: string
  - name: answer
    dtype: string
  - name: source
    list: string
  - name: mean_reward
    dtype: float64
  - name: full_response
    dtype: string
  - name: full_reasoning
    dtype: string
  - name: model
    dtype: string
  - name: prefix
    dtype: string
  - name: prefix_end_index
    dtype: int64
  - name: num_thoughts
    dtype: int64
  - name: prefix_type
    dtype: string
  - name: prefix_type_description
    dtype: string
  - name: suffix_num
    list: int64
  - name: suffix_model
    list: string
  - name: pending
    list: bool
  - name: pending_model
    list: 'null'
  - name: suffix_response
    list: string
  - name: suffix_summary
    list: string
  - name: self_summary
    list: string
  - name: suffix_reasoning
    list: string
  - name: finish_reason
    list: string
  - name: budget_used
    list: int64
  - name: escalation
    list: int64
  - name: usage
    list:
    - name: completion_tokens
      dtype: int64
    - name: prompt_tokens
      dtype: int64
    - name: total_tokens
      dtype: int64
  - name: error
    list: 'null'
  - name: error_type
    list: 'null'
  - name: prefix_model
    dtype: string
  - name: gemini_summary_of_future
    dtype: string
  - name: gemini_summary_list
    list: string
  - name: prefix_steps
    list: string
  - name: suffix_variants
    list:
    - name: detailed_steps
      list: string
    - name: high_level_steps
      list: string
    - name: id
      dtype: int64
  - name: dedup_note
    dtype: string
  - name: cross_prefix_alignment_scores
    list:
    - name: avg_alignment
      dtype: float64
    - name: individual_scores
      list:
      - name: compared_row_id
        dtype: int64
      - name: compared_summary_id
        dtype: int64
      - name: direction
        dtype: string
      - name: output_text
        dtype: string
      - name: problem_index
        dtype: int64
      - name: reasoning
        dtype: string
      - name: score
        dtype: float64
    - name: num_comparisons
      dtype: int64
    - name: summary_id
      dtype: int64
  - name: filtered_suffix
    list:
    - name: detailed_steps
      list: string
    - name: high_level_steps
      list: string
    - name: id
      dtype: int64
  - name: rubrics
    dtype: string
  - name: prefix_summary_steps
    dtype: string
  - name: filtered_suffix_summary_steps
    list: string
  - name: input_to_VF
    dtype: string
  splits:
  - name: train
    num_bytes: 1361875991
    num_examples: 4410
  - name: test
    num_bytes: 30386279
    num_examples: 100
  download_size: 1160459360
  dataset_size: 1392262270
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
  - split: test
    path: data/test-*
---
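The `cross_prefix_alignment_scores` feature is a list of records, each carrying its own `individual_scores` list, so reading the per-comparison `score` values takes two levels of iteration. The following is a minimal sketch of that traversal over a row represented as plain Python dicts. The helper name `collect_scores` and the toy row are hypothetical, the values are made up for illustration, and the assumption that `score` sits inside `individual_scores` is inferred from the field order on the card rather than stated by it.

```python
from statistics import mean
from typing import Any


def collect_scores(row: dict[str, Any]) -> list[float]:
    """Flatten every inner `score` found under `cross_prefix_alignment_scores`."""
    scores: list[float] = []
    # Outer list: one entry per summary (assumed layout, see lead-in).
    for entry in row.get("cross_prefix_alignment_scores") or []:
        # Inner list: one entry per pairwise comparison.
        for item in entry.get("individual_scores") or []:
            score = item.get("score")
            if score is not None:  # skip missing values
                scores.append(float(score))
    return scores


# Toy row with made-up values, shaped like the schema; NOT real dataset content.
toy_row = {
    "cross_prefix_alignment_scores": [
        {
            "avg_alignment": 0.5,
            "individual_scores": [{"score": 0.25}, {"score": 0.75}],
        },
        {
            "avg_alignment": 1.0,
            "individual_scores": [{"score": 1.0}],
        },
    ]
}

scores = collect_scores(toy_row)
print(scores, mean(scores) if scores else None)
```

The `or []` guards are there because list-typed features may come back as `None` or empty for some rows; whether that actually happens in this dataset is not stated on the card.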
Provider:
haoranli-ml
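
The card lists a single `default` config with `train` and `test` splits. As a hedged sketch of how one of them might be read, the snippet below wraps `load_dataset` from the Hugging Face `datasets` library. The wrapper name `load_split` and the `SPLITS` table are hypothetical conveniences; the example counts are copied from the card. Running it assumes `datasets` is installed, the repository is reachable, and access is not restricted, none of which the card confirms.

```python
REPO_ID = "haoranli-ml/genvf-sdpo-filtered_0.6"

# Example counts per split, as listed on the dataset card.
SPLITS = {"train": 4410, "test": 100}


def load_split(split: str = "test", streaming: bool = True):
    """Return one split of the dataset (requires `datasets` and network access)."""
    if split not in SPLITS:
        raise ValueError(f"unknown split {split!r}; expected one of {sorted(SPLITS)}")
    # Imported lazily so this module can be imported without the dependency.
    from datasets import load_dataset

    # streaming=True iterates rows lazily instead of downloading the full
    # archive first (the card reports a download size of about 1.16 GB).
    return load_dataset(REPO_ID, split=split, streaming=streaming)
```

With streaming enabled, `next(iter(load_split("test")))` would yield the first test row as a dict keyed by the feature names in the schema. Starting with the 100-row `test` split keeps a first inspection cheap.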



