five

summarize-from-feedback

收藏
OpenXLab2026-04-18 收录
下载链接:
https://openxlab.org.cn/datasets/OpenDataLab/summarize-from-feedback
下载链接
链接失效反馈
官方服务:
资源简介:
In the Learning to Summarize from Human Feedback paper, a reward model was trained from human feedback. The reward model was then used to train a summarization model to align with human preferences. This is the dataset of human feedback that was released for reward modelling. There are two parts of this dataset: comparisons and axis. In the comparisons part, human annotators were asked to choose the best out of two summaries. In the axis part, human annotators gave scores on a likert scale for the quality of a summary. The comparisons part only has a train and validation split, and the axis part only has a test and validation split. The summaries used for training the reward model in the paper come from the TL;DR dataset. Additional validation and test data come from the TL;DR dataset, CNN articles, and Daily Mail articles.
提供机构:
OpenDataLab
创建时间:
2023-12-14
5,000+
优质数据集
54 个
任务类型
进入经典数据集
二维码
社区交流群

面向社区/商业的数据集话题

二维码
科研交流群

面向高校/科研机构的开源数据集话题

数据驱动未来

携手共赢发展

商业合作