Tongyi-ConvAI/RM-NLHF
Source: Hugging Face | Updated: 2026-02-25 | Indexed: 2026-04-05
Download link:
https://hf-mirror.com/datasets/Tongyi-ConvAI/RM-NLHF
Resource description:
---
license: cc-by-4.0
---
# 💡 Reward Modeling from Natural Language Human Feedback
<p align="left">
<a href="https://arxiv.org/abs/2601.07349">
<img
src="https://img.shields.io/badge/arXiv-RM--NLHF-red?logo=arxiv" style="display: inline-block; vertical-align: middle;"
alt="RM-NLHF Paper on arXiv"
/>
</a>
<a href="https://github.com/Tongyi-ConvAI/Qwen-Character/tree/main/Character-GenRM-NLHF" target="_blank" style="margin: 2px;">
<img
alt="RM-NLHF Codebase" src="https://img.shields.io/badge/Github -RM--NLHF--Codebase-536af5?color=536af5&logo=github" style="display: inline-block; vertical-align: middle;"
/>
</a>
<a href="https://huggingface.co/Tongyi-ConvAI/RM-NLHF-Qwen-7B" target="_blank" style="margin: 2px;">
<img
alt="HF Model: RM-NLHF-7B" src="https://img.shields.io/badge/%F0%9F%A4%97%20_Model-RM--NLHF--7B-ffc107?color=ffc107&logoColor=white" style="display: inline-block; vertical-align: middle;"
/>
</a>
<a href="https://huggingface.co/Tongyi-ConvAI/RM-NLHF-Qwen-7B" target="_blank" style="margin: 2px;">
<img
alt="HF Model: RM-NLHF-32B" src="https://img.shields.io/badge/%F0%9F%A4%97%20_Model-RM--NLHF--32B-ffc107?color=ffc107&logoColor=white" style="display: inline-block; vertical-align: middle;"
/>
</a>
</p>
This is the official dataset used in the paper "Reward Modeling from Natural Language Human Feedback".
# 🔑 Key Features
- RM-NLHF integrates multiple preference datasets.
- We employ Qwen3-235B-A22B-2507 to extract key points from the human-annotated commentary portion of HelpSteer3 and reformat them into structured bullet-point lists.
- In this repository, we are open-sourcing only the HelpSteer3 and Tulu-3-Pref-Personas-Instruction-Following portions of the data.
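
The reformatting step described above (turning extracted key points into a structured bullet-point list) can be illustrated with a small sketch. This is not the authors' pipeline: the card does not publish its prompt or post-processing, so the helper name `to_bullet_list`, the accepted bullet markers, and the de-duplication rule are all assumptions made for illustration. The LLM extraction itself is out of scope here; the sketch only normalizes text that has already been extracted.

```python
import re

# Hypothetical helper: the dataset card does not describe its parser, so the
# marker conventions handled here ("-", "*", "•", "1.", "1)") are assumptions.
_BULLET_PREFIX = re.compile(r"^\s*(?:[-*•]|\d+[.)])\s+")


def to_bullet_list(raw_key_points: str) -> str:
    """Normalize free-form key-point text into a uniform '- item' bullet list."""
    bullets = []
    for line in raw_key_points.splitlines():
        # Strip any leading list marker, then surrounding whitespace.
        text = _BULLET_PREFIX.sub("", line).strip()
        # Skip blank lines and exact duplicates, preserving first-seen order.
        if text and text not in bullets:
            bullets.append(text)
    return "\n".join(f"- {b}" for b in bullets)


example = """1. Response A ignores the word limit.
* Response B cites the correct API.

- Response B cites the correct API."""
print(to_bullet_list(example))
```

Keeping the output in one fixed bullet style makes the key points easier to compare or tokenize consistently, whatever marker style the extraction model happened to emit.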
# 🧷 Citation
```
@misc{wang2026rewardmodelingnaturallanguage,
title={Reward Modeling from Natural Language Human Feedback},
author={Zongqi Wang and Rui Wang and Yuchuan Wu and Yiyao Yu and Pinyi Zhang and Shaoning Sun and Yujiu Yang and Yongbin Li},
year={2026},
eprint={2601.07349},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2601.07349},
}
```
Provider:
Tongyi-ConvAI



