
Tongyi-ConvAI/RM-NLHF

Source: Hugging Face · Updated 2026-02-25 · Indexed 2026-04-05
Download link:
https://hf-mirror.com/datasets/Tongyi-ConvAI/RM-NLHF
Description:
---
license: cc-by-4.0
---

# 💡 Reward Modeling from Natural Language Human Feedback

<p align="left">
  <a href="https://arxiv.org/abs/2601.07349">
    <img src="https://img.shields.io/badge/arXiv-RM--NLHF-red?logo=arxiv" style="display: inline-block; vertical-align: middle;" alt="RM-NLHF Paper on arXiv" />
  </a>
  <a href="https://github.com/Tongyi-ConvAI/Qwen-Character/tree/main/Character-GenRM-NLHF" target="_blank" style="margin: 2px;">
    <img alt="Github" src="https://img.shields.io/badge/Github -RM--NLHF--Codebase-536af5?color=536af5&logo=github" style="display: inline-block; vertical-align: middle;" />
  </a>
  <a href="https://huggingface.co/Tongyi-ConvAI/RM-NLHF-Qwen-7B" target="_blank" style="margin: 2px;">
    <img alt="HF Model: RM-NLHF-7B" src="https://img.shields.io/badge/%F0%9F%A4%97%20_Model-RM--NLHF--7B-ffc107?color=ffc107&logoColor=white" style="display: inline-block; vertical-align: middle;" />
  </a>
  <a href="https://huggingface.co/Tongyi-ConvAI/RM-NLHF-Qwen-7B" target="_blank" style="margin: 2px;">
    <img alt="HF Model: RM-NLHF-32B" src="https://img.shields.io/badge/%F0%9F%A4%97%20_Model-RM--NLHF--32B-ffc107?color=ffc107&logoColor=white" style="display: inline-block; vertical-align: middle;" />
  </a>
</p>

This is the official dataset used in the paper "Reward Modeling from Natural Language Human Feedback".

# 🔑 Key Features

- RM-NLHF integrates multiple preference datasets.
- We employ Qwen3-235B-A22B-2507 to extract key points from the human-annotated commentary portion of HelpSteer3 and reformat them into structured bullet-point lists.
- In this repository, we are open-sourcing only the HelpSteer3 and Tulu-3-Pref-Personas-Instruction-Following portions of the data.

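
The card does not document the splits or columns, so a first step after downloading is usually to inspect them. The following is a minimal sketch of that, not an official loader: it assumes the Hugging Face `datasets` library is installed and that the repository loads with its default configuration (it may instead need a configuration name or a `data_files` argument). The helper names `load_rm_nlhf` and `summarize_splits` are hypothetical and are not part of the RM-NLHF codebase.

```python
from typing import Any, Dict, List, Mapping, Tuple

# Dataset ID taken from the download link on this page.
REPO_ID = "Tongyi-ConvAI/RM-NLHF"


def load_rm_nlhf() -> Any:
    """Fetch the dataset from the Hugging Face Hub (requires network access)."""
    # Imported lazily so summarize_splits works without `datasets` installed.
    from datasets import load_dataset

    # Assumption: the repository loads with its default configuration.
    return load_dataset(REPO_ID)


def summarize_splits(splits: Mapping[str, Any]) -> Dict[str, Tuple[int, List[str]]]:
    """Map each split name to (row count, column names).

    Works on any mapping whose values support len() and expose
    `column_names`, which is the case for a `datasets.DatasetDict`.
    """
    return {
        name: (len(split), list(split.column_names))
        for name, split in splits.items()
    }
```

Calling `summarize_splits(load_rm_nlhf())` returns one entry per split, which shows which fields are available before any training code is written against them.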
# 🧷 Citation

```
@misc{wang2026rewardmodelingnaturallanguage,
      title={Reward Modeling from Natural Language Human Feedback},
      author={Zongqi Wang and Rui Wang and Yuchuan Wu and Yiyao Yu and Pinyi Zhang and Shaoning Sun and Yujiu Yang and Yongbin Li},
      year={2026},
      eprint={2601.07349},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2601.07349},
}
```

Provider:
Tongyi-ConvAI