lihaoxin2020/rl_hard_gpt5_sft_gpt54rubric_v2
Source: Hugging Face. Updated 2026-04-18; indexed 2026-04-26.
Download link:
https://hf-mirror.com/datasets/lihaoxin2020/rl_hard_gpt5_sft_gpt54rubric_v2
Resource description:
---
dataset_info:
features:
- name: conversations
list:
- name: role
dtype: string
- name: content
dtype: string
- name: thinking
dtype: string
- name: metadata
struct:
- name: sample_id
dtype: string
- name: traj_idx
dtype: int64
- name: turn_index
dtype: int64
- name: tool_name
dtype: string
- name: tool_query
dtype: string
- name: refiner_mode
dtype: string
- name: stop_reason
dtype: string
- name: accepted
dtype: bool
- name: pass
dtype: int64
- name: format_bonus
dtype: float64
- name: citation_format_reward
dtype: float64
- name: citation_paper_reward
dtype: float64
- name: citation_metrics
struct:
- name: citation_format_reward
dtype: float64
- name: citation_avg_claim_recall
dtype: float64
- name: citation_avg_claim_precision
dtype: float64
- name: citation_avg_claim_f1
dtype: float64
- name: citation_paper_reward
dtype: float64
- name: citation_claim_count
dtype: float64
- name: citation_uncited_claim_count
dtype: float64
- name: citation_score_applicable
dtype: float64
- name: gpt5_generation
dtype: string
- name: rubrics
dtype: string
splits:
- name: train
num_examples: 5195
configs:
- config_name: default
data_files:
- split: train
path: data/train-*
---
# rl_hard_gpt5_sft_gpt54rubric_v2
Per-instance rubrics generated by GPT-5.4 for the `lihaoxin2020/rl_hard_gpt5_sft` data.
Each row has a `rubrics` column (a JSON string) of the form
`{"positive_rubrics": [...], "negative_rubrics": [...]}`.
Each item is a `{title, description}` object, and each instance is capped at 5 rubrics.
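As a rough illustration of that layout, the sketch below parses one `rubrics` string and checks that every item carries both keys. The helper name `check_rubrics` and the sample rubric texts are hypothetical and are not taken from the dataset.

```python
import json

# Keys each rubric item is described as having.
REQUIRED_KEYS = {"title", "description"}


def check_rubrics(rubrics_json: str) -> dict:
    """Parse a `rubrics` JSON string and return the item count per list.

    Raises ValueError if an item lacks `title` or `description`.
    """
    rubrics = json.loads(rubrics_json)
    counts = {}
    for list_name in ("positive_rubrics", "negative_rubrics"):
        items = rubrics.get(list_name, [])
        for item in items:
            missing = REQUIRED_KEYS - set(item)
            if missing:
                raise ValueError(f"{list_name} item is missing {sorted(missing)}")
        counts[list_name] = len(items)
    return counts


# Hypothetical sample value, written only to show the expected shape.
sample = json.dumps({
    "positive_rubrics": [
        {"title": "Direct answer", "description": "States the answer and cites a snippet."}
    ],
    "negative_rubrics": [
        {"title": "No fabrication", "description": "Does not invent unsupported facts."}
    ],
})
counts = check_rubrics(sample)
```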
## Difference from v1 (lihaoxin2020/rl_hard_gpt5_sft_gpt54rubric)
An LLM audit of v1 found that rubrics were well-tuned for the core answer
requirement (direct, cited answer; no fabrication) but systematically did
not cover the cases where snippets only **partially** answer the query or
do not answer it at all. In v2 the rubric-generator prompt was updated
to explicitly classify each instance by **snippet adequacy** (full /
partial / none) and include the case-appropriate rubrics:
- **PARTIAL / NONE cases** now include:
  - A positive rubric rewarding **engaged partial-grounding + forward
    guidance**: citing whichever related clues are present in the snippets
    AND suggesting a specific next search that would close the gap, not
    a boilerplate "no info found" refusal.
  - A negative rubric naming a **concrete close-but-irrelevant detail**
    from this example's snippets that a weaker model would be tempted to
    volunteer as if it answered the query (e.g. for "where did A's father
    die", a sibling's residence mentioned in the snippets).
- **FULL cases** now include a negative rubric penalizing **scope creep**:
  a concrete off-query detail present in the snippets that the query did
  not ask for. Only written when a concrete distractor exists.
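The case split above can be summarized as a small lookup. This is only a sketch of the rule as described here, not the actual generator prompt or code; the enum, the function name, and the theme labels are all hypothetical.

```python
from enum import Enum


class SnippetAdequacy(Enum):
    """Hypothetical labels for the three adequacy classes named above."""
    FULL = "full"
    PARTIAL = "partial"
    NONE = "none"


def case_specific_rubrics(adequacy: SnippetAdequacy,
                          has_concrete_distractor: bool) -> list:
    """Return (polarity, theme) pairs that v2 adds for an adequacy class."""
    if adequacy is SnippetAdequacy.FULL:
        # The scope-creep rubric is only written when a concrete distractor exists.
        if has_concrete_distractor:
            return [("negative", "scope creep")]
        return []
    # PARTIAL and NONE share the same pair of case-specific rubrics.
    return [
        ("positive", "engaged partial-grounding + forward guidance"),
        ("negative", "concrete close-but-irrelevant detail"),
    ]


full_case = case_specific_rubrics(SnippetAdequacy.FULL, has_concrete_distractor=True)
partial_case = case_specific_rubrics(SnippetAdequacy.PARTIAL, has_concrete_distractor=False)
```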
On a 30-instance pilot audit, rubric-sets matching the specification
improved from **1/30 (3%)** in v1 to **20/30 (67%)** in v2.
## Schema
Same as v1; the `rubrics` column is a JSON string with two parallel lists
of `{title, description}`. Parse with `json.loads(row["rubrics"])`.
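A minimal sketch of that parsing step, flattening both lists into `(polarity, title, description)` tuples. The helper names and the `fake_row` value are hypothetical. `load_train_split` assumes the `datasets` library is installed and the Hub is reachable, so it is defined but not called here.

```python
import json
from typing import Iterator, Tuple


def iter_rubrics(row: dict) -> Iterator[Tuple[str, str, str]]:
    """Yield (polarity, title, description) for every rubric in a row."""
    parsed = json.loads(row["rubrics"])
    for polarity in ("positive", "negative"):
        for item in parsed.get(f"{polarity}_rubrics", []):
            yield polarity, item["title"], item["description"]


def load_train_split():
    """Load the train split from the Hub (needs `datasets` and network access)."""
    from datasets import load_dataset  # imported lazily so the rest runs without it
    return load_dataset("lihaoxin2020/rl_hard_gpt5_sft_gpt54rubric_v2", split="train")


# Hypothetical row containing only the column used here.
fake_row = {
    "rubrics": json.dumps({
        "positive_rubrics": [{"title": "P1", "description": "Cites the relevant snippet."}],
        "negative_rubrics": [{"title": "N1", "description": "Adds an off-query detail."}],
    })
}
flat = list(iter_rubrics(fake_row))
```

With a real row, `iter_rubrics(load_train_split()[0])` would be the equivalent call.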
Provider:
lihaoxin2020



