SoftAge-AI/rlhf-ranking_dataset
Source: Hugging Face, updated 2024-03-06, indexed 2024-06-22
Download link:
https://hf-mirror.com/datasets/SoftAge-AI/rlhf-ranking_dataset
Resource description:
---
license: mit
language:
- en
---
# RLHF Response Ranking Dataset
## Description
This dataset supports research on response ranking for reinforcement learning from human feedback (RLHF) with large language models in the CODE & STEM domain.
It contains 500 prompt-response pairs, each with the following data attributes:
- M_Id & S.No.: Unique identifiers for the prompt-response pair.
- Prompt: The original query or problem statement.
- Response 1 & 2: Responses generated by different language models.
- prompt_type: Category of the prompt (e.g., mathematical equation, coding problem).
- Preference: Indicates which response is considered better (1 or 2).
- Remark: Additional information about the ranking decision.
- Safety labels (all Y/N):
  - Fails to follow instructions
  - Contains sexual content
  - Contains violent content
  - Encourages harmful behavior
  - Expresses moral judgment
  - Gives harmful advice
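
To make the attribute list above concrete, here is a minimal sketch of parsing one record into a typed structure. It assumes the column headers match the attribute names listed above and that the safety labels apply to the record as a whole; neither is confirmed by this card. `RankedPair` and `parse_row` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field

# Assumed column names, copied from the attribute list; the real file
# headers may be spelled differently.
SAFETY_LABELS = (
    "Fails to follow instructions",
    "Contains sexual content",
    "Contains violent content",
    "Encourages harmful behavior",
    "Expresses moral judgment",
    "Gives harmful advice",
)


@dataclass
class RankedPair:
    prompt: str
    response_1: str
    response_2: str
    prompt_type: str
    preference: int  # 1 or 2: which response was judged better
    remark: str = ""
    safety: dict = field(default_factory=dict)  # label -> bool


def parse_row(row: dict) -> RankedPair:
    """Validate one raw record (a dict of strings) and convert it."""
    preference = int(row["Preference"])
    if preference not in (1, 2):
        raise ValueError(f"Preference must be 1 or 2, got {preference!r}")

    safety = {}
    for label in SAFETY_LABELS:
        # Assumption: a missing label is treated as "N".
        value = str(row.get(label, "N")).strip().upper()
        if value not in ("Y", "N"):
            raise ValueError(f"{label!r} must be Y or N, got {value!r}")
        safety[label] = value == "Y"

    return RankedPair(
        prompt=row["Prompt"],
        response_1=row["Response 1"],
        response_2=row["Response 2"],
        prompt_type=row["prompt_type"],
        preference=preference,
        remark=row.get("Remark", ""),
        safety=safety,
    )
```

Validating `Preference` and the Y/N flags at load time surfaces malformed rows early instead of letting them propagate into training code.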
## Dataset Source
This dataset is curated by the delivery team @SoftAge.
## Limitations and Biases
- This dataset might not capture the full diversity of CODE & STEM problems and response qualities.
- Preference labels and safety ratings might reflect the inherent biases of human annotators or domain experts.
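
One simple way to start probing the annotator bias mentioned above is to tally how often each response slot is preferred within each prompt category. The sketch below assumes rows are dicts keyed by `prompt_type` and `Preference` as named in the description; `preference_counts` is a hypothetical helper, and a skewed tally is only a hint worth investigating, not proof of bias.

```python
from collections import Counter


def preference_counts(rows):
    """Count preferred slots (1 or 2) per prompt_type.

    Returns a dict mapping prompt_type -> Counter({1: n1, 2: n2}).
    """
    counts = {}
    for row in rows:
        by_type = counts.setdefault(row["prompt_type"], Counter())
        by_type[int(row["Preference"])] += 1
    return counts
```

A category where one slot wins almost every time may indicate position bias or a systematic quality gap between the two source models.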
## Potential Uses
- Training and analysing RLHF models for generating informative and safe responses in the CODE & STEM domain.
- Identifying areas for improvement in language models.
- Developing new metrics and methods for RLHF in different domains.
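
For the training use case, ranked records are often reshaped into chosen/rejected pairs before fitting a reward model. The following is a sketch under stated assumptions: the column names follow the description above, and the `prompt`/`chosen`/`rejected` output layout is a common convention rather than something this dataset prescribes. `to_preference_pair` is a hypothetical helper.

```python
def to_preference_pair(row: dict) -> dict:
    """Turn one ranked record into a prompt/chosen/rejected triple."""
    preference = int(row["Preference"])
    if preference not in (1, 2):
        raise ValueError(f"Preference must be 1 or 2, got {preference!r}")

    first, second = row["Response 1"], row["Response 2"]
    # The preferred response becomes "chosen", the other "rejected".
    chosen, rejected = (first, second) if preference == 1 else (second, first)
    return {"prompt": row["Prompt"], "chosen": chosen, "rejected": rejected}
```

For example, a record with `Preference` equal to 2 yields `Response 2` as `chosen` and `Response 1` as `rejected`.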
Provider:
SoftAge-AI