efederici/shp-partial-it
Source: Hugging Face. Updated 2023-04-05; indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/efederici/shp-partial-it
Resource description:
---
task_categories:
- text-generation
- question-answering
language:
- it
tags:
- RLHF
- preferences
- RL
- human feedback
- reddit
size_categories:
- 10K<n<100K
---
# 🚢 Stanford Human Preferences Dataset (SHP) (Italian Translation)
The Stanford Human Preferences Dataset (SHP) is a collection of responses to questions and instructions in 18 different subject areas, ranging from cooking to legal advice. This version of the dataset is a **partial** Italian translation of the original English dataset.
Please note that the quality of the translations has not been verified. However, the dataset may still be useful for training models.
Each example in the dataset consists of a Reddit post that includes a question or instruction and a pair of top-level comments. The comments are ranked by Reddit users according to their perceived helpfulness. SHP relies on the following observation: if comment A has a higher score than comment B despite being written after B, then A is taken to be the preferred response.
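The timestamp-and-score rule can be illustrated with a minimal sketch. The `Comment` class, its field names, and the `preferred` function are hypothetical names chosen for this illustration; they are not taken from the dataset schema or from the SHP authors' code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Comment:
    """A top-level Reddit comment (field names are hypothetical)."""
    text: str
    score: int        # net upvotes received by the comment
    created_utc: int  # posting time as a Unix timestamp


def preferred(a: Comment, b: Comment) -> Optional[Comment]:
    """Return the comment the rule marks as preferred, or None.

    A preference is inferred only when the later comment has the
    higher score: it had less time to collect votes and still came
    out ahead. In every other case no label is inferred.
    """
    if a.created_utc == b.created_utc:
        return None  # no ordering in time, so the rule does not apply
    earlier, later = (a, b) if a.created_utc < b.created_utc else (b, a)
    if later.score > earlier.score:
        return later
    return None


# Usage: the second comment was written later but scored higher.
first = Comment("Usa il lievito madre.", score=10, created_utc=1_000)
second = Comment("Prova con meno acqua.", score=25, created_utc=2_000)
winner = preferred(first, second)
```

Under this sketch, an earlier comment with a higher score yields no label, since its lead could simply reflect the extra time it had to collect votes.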
The preference labels in the dataset reflect how helpful a response is rather than whether it is harmful. This differs from previous work, which focused on identifying harmful responses.
Provider:
efederici
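Summary of original information
Dataset overview
Dataset name
Stanford Human Preferences Dataset (SHP) (Italian Translation)
Dataset description
This is a partial Italian translation of the original English dataset, covering questions and instructions across 18 different subject areas. Each example in the dataset consists of a Reddit post containing a question or instruction and a pair of top-level comments. The comments are ranked by Reddit users according to their perceived helpfulness.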
Dataset features
- Language: Italian
- Task categories:
  - Text generation
  - Question answering
- Tags:
  - RLHF
  - Preferences
  - RL
  - Human feedback
- Size category: 10K<n<100K
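Usage notes
- The translation quality has not been verified, but the dataset may still be useful for model training.
- The preference labels reflect the helpfulness of a response rather than identifying harmful responses.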



