Name: haoranli-ml/genvf-sdpo-filtered_0.6
Creator: haoranli-ml
Published: 2026-04-01 20:18:42
License: No description available

Download link:

https://hf-mirror.com/datasets/haoranli-ml/genvf-sdpo-filtered_0.6
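As a minimal sketch of how the link above might be used programmatically, the snippet below loads one split with the Hugging Face `datasets` library. It assumes `datasets` is installed and the network is reachable, and that the mirror honours the `HF_ENDPOINT` environment variable read by `huggingface_hub`. The helper names `load_split` and `dataset_url` are hypothetical and used only for illustration.

```python
import os

REPO_ID = "haoranli-ml/genvf-sdpo-filtered_0.6"
MIRROR_ENDPOINT = "https://hf-mirror.com"  # host taken from the download link


def dataset_url(endpoint: str = MIRROR_ENDPOINT) -> str:
    """Build the dataset page URL for a given hub endpoint."""
    return f"{endpoint}/datasets/{REPO_ID}"


def load_split(split: str = "test", use_mirror: bool = True):
    """Load one split of the dataset (needs network access and `datasets`).

    Assumption: HF_ENDPOINT has to be set before huggingface_hub is first
    imported, so the import is deferred until after the variable is set.
    """
    if use_mirror:
        os.environ.setdefault("HF_ENDPOINT", MIRROR_ENDPOINT)
    from datasets import load_dataset  # deferred on purpose, see docstring

    return load_dataset(REPO_ID, split=split)


if __name__ == "__main__":
    print(dataset_url())
```

Calling `load_split("test")` would fetch the smaller of the two splits, which is a reasonable first check before downloading the full training data.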

Resource overview:

---
dataset_info:
  features:
  - name: index
    dtype: int64
  - name: row_id
    dtype: int64
  - name: problem
    dtype: string
  - name: answer
    dtype: string
  - name: source
    list: string
  - name: mean_reward
    dtype: float64
  - name: full_response
    dtype: string
  - name: full_reasoning
    dtype: string
  - name: model
    dtype: string
  - name: prefix
    dtype: string
  - name: prefix_end_index
    dtype: int64
  - name: num_thoughts
    dtype: int64
  - name: prefix_type
    dtype: string
  - name: prefix_type_description
    dtype: string
  - name: suffix_num
    list: int64
  - name: suffix_model
    list: string
  - name: pending
    list: bool
  - name: pending_model
    list: 'null'
  - name: suffix_response
    list: string
  - name: suffix_summary
    list: string
  - name: self_summary
    list: string
  - name: suffix_reasoning
    list: string
  - name: finish_reason
    list: string
  - name: budget_used
    list: int64
  - name: escalation
    list: int64
  - name: usage
    list:
    - name: completion_tokens
      dtype: int64
    - name: prompt_tokens
      dtype: int64
    - name: total_tokens
      dtype: int64
  - name: error
    list: 'null'
  - name: error_type
    list: 'null'
  - name: prefix_model
    dtype: string
  - name: gemini_summary_of_future
    dtype: string
  - name: gemini_summary_list
    list: string
  - name: prefix_steps
    list: string
  - name: suffix_variants
    list:
    - name: detailed_steps
      list: string
    - name: high_level_steps
      list: string
    - name: id
      dtype: int64
  - name: dedup_note
    dtype: string
  - name: cross_prefix_alignment_scores
    list:
    - name: avg_alignment
      dtype: float64
    - name: individual_scores
      list:
      - name: compared_row_id
        dtype: int64
      - name: compared_summary_id
        dtype: int64
      - name: direction
        dtype: string
      - name: output_text
        dtype: string
      - name: problem_index
        dtype: int64
      - name: reasoning
        dtype: string
      - name: score
        dtype: float64
    - name: num_comparisons
      dtype: int64
    - name: summary_id
      dtype: int64
  - name: filtered_suffix
    list:
    - name: detailed_steps
      list: string
    - name: high_level_steps
      list: string
    - name: id
      dtype: int64
  - name: rubrics
    dtype: string
  - name: prefix_summary_steps
    dtype: string
  - name: filtered_suffix_summary_steps
    list: string
  - name: input_to_VF
    dtype: string
  splits:
  - name: train
    num_bytes: 1361875991
    num_examples: 4410
  - name: test
    num_bytes: 30386279
    num_examples: 100
  download_size: 1160459360
  dataset_size: 1392262270
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
  - split: test
    path: data/test-*
---
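The sketch below works through two things the metadata above implies: the split sizes add up to `dataset_size`, and nested fields such as `cross_prefix_alignment_scores` can be walked as ordinary Python structures. It assumes that a `list:` of named sub-fields is decoded as a list of dicts per row. The `example_row` values are made up for illustration and are not taken from the dataset, and the helper names are hypothetical.

```python
# Split statistics copied from the dataset card metadata.
SPLITS = {
    "train": {"num_bytes": 1361875991, "num_examples": 4410},
    "test": {"num_bytes": 30386279, "num_examples": 100},
}
DATASET_SIZE = 1392262270


def total_bytes(splits: dict) -> int:
    """Sum num_bytes over all splits."""
    return sum(s["num_bytes"] for s in splits.values())


def mean_bytes_per_example(split: dict) -> float:
    """Average uncompressed size of one example in a split."""
    return split["num_bytes"] / split["num_examples"]


def best_alignment(row: dict):
    """Return (summary_id, avg_alignment) of the highest-scoring entry.

    Assumption: row["cross_prefix_alignment_scores"] is a list of dicts
    carrying the keys listed in the card; returns None if it is empty.
    """
    scores = row.get("cross_prefix_alignment_scores") or []
    if not scores:
        return None
    top = max(scores, key=lambda s: s["avg_alignment"])
    return top["summary_id"], top["avg_alignment"]


# Hypothetical row fragment with invented values, shaped like the schema.
example_row = {
    "cross_prefix_alignment_scores": [
        {"summary_id": 0, "avg_alignment": 0.5, "num_comparisons": 2,
         "individual_scores": []},
        {"summary_id": 1, "avg_alignment": 0.8, "num_comparisons": 2,
         "individual_scores": []},
    ]
}

if __name__ == "__main__":
    print(total_bytes(SPLITS) == DATASET_SIZE)
    print(best_alignment(example_row))
```

The per-example average from `mean_bytes_per_example` is on the order of hundreds of kilobytes in both splits, which suggests that streaming or loading only the test split first may be more practical than materialising everything in memory.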


Use cases: