nvidia/Nemotron-RLHF-GenRM-v1

Name: nvidia/Nemotron-RLHF-GenRM-v1
Creator: nvidia
Published: 2026-03-11 00:22:00
License: 暂无描述

Hugging Face2026-03-11 更新2026-03-29 收录

下载链接：

https://hf-mirror.com/datasets/nvidia/Nemotron-RLHF-GenRM-v1

下载链接

链接失效反馈

官方服务：

资源简介：

--- language: - en license: - odc-by task_categories: - reinforcement-learning - text-generation configs: - config_name: default data_files: - split: train path: data/train.jsonl --- ## Dataset Description: This dataset is designed to train Generative Reward Models (GenRMs). It leverages reinforcement learning at scale to train accurate and robust GenRMs that generalize better than traditional Bradley-Terry models and reduce the risk of reward hacking. The dataset is composed of: * Preference data focused on diverse domains * A synthetic safety blend The data follows a "meta-prompt" structure where the model is instructed to act as an expert evaluation judge. For GenRM training, each sample includes: 1. **System/User Prompt**: Instructions for the judge, including evaluation criteria and scoring guidelines. 2. **Conversation Context**: The dialogue history and the latest user query. 3. **Responses to be Scored**: Two candidate assistant responses (Response 1 and Response 2). 4. **Evaluation Plan**: Specific rubrics for the current case (e.g., safety, helpfulness, refusal of harmful requests). 5. **Output Format**: Instructions to output a specific JSON format containing analysis, individual scores, and a ranking. The GenRM reasons through the strengths and weaknesses of both responses and produces: * **Individual Helpfulness Score (1-5)**: Higher means more helpful. * **Ranking Score (1-6)**: 1 denotes response 1 is far superior; 6 denotes response 2 is far superior. This dataset is ready for commercial use. ## Dataset Owner(s): NVIDIA Corporation ## Dataset Creation Date: Created on: 12/01/2025 Last Modified on: 12/01/2025 ## License/Terms of Use: This dataset is licensed under the ODC Attribution License (https://opendatacommons.org/licenses/by/1-0/). Additional Information: Contains information from allenai/WildChat-1M which is made available under the ODC Attribution License. ## Intended Usage: This dataset is intended for: 1. Training Generative Reward Models (GenRMs) to reason about response quality and provide granular feedback. 2. Improving model generalization and reducing reward hacking compared to traditional methods. ## Dataset Characterization **Data Collection Method** * Hybrid: Human, Synthetic **Labeling Method** * Human ## Dataset Format Modality: Text Format: JSONL Structure: Each line is a JSON object containing a `messages` list. The user message is a structured prompt for a judge. * **Input**: "You are an expert evaluation judge..." followed by Context, Responses, Plan, and Guidelines. * **Metadata**: Includes `question_id`, `task_name`, `dataset` source, and ground truth `ranking`. ## Dataset Quantification | Subset | Samples | |--------|---------| | train | 299,517 | Total Data Storage: ~5GB ## Ethical Considerations: NVIDIA believes Trustworthy AI is a shared responsibility and we have NVIDIA believes Trustworthy AI is a shared responsibility and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their internal developer teams to ensure this dataset meets requirements for the relevant industry and use case and addresses unforeseen product misuse. Please report quality, risk, security vulnerabilities or NVIDIA AI Concerns [here](https://www.nvidia.com/en-us/support/submit-security-vulnerability/)

提供机构：

nvidia

5,000+

优质数据集

54 个

任务类型

进入经典数据集