nvidia/Nemotron-RL-knowledge-openqa
Source: Hugging Face. Updated 2025-12-12; indexed 2025-12-20.
Download link:
https://hf-mirror.com/datasets/nvidia/Nemotron-RL-knowledge-openqa
Summary:
Nemotron-RL-knowledge-openQA is a multi-domain synthetic dataset of knowledge-based questions. It is built from unstructured sources such as books and articles and consists of question-answer pairs that require short responses. The dataset covers a wide range of domains, including physics, biology, mathematics, computer science, engineering, chemistry, and law. It is released as part of NVIDIA NeMo Gym, a framework for building reinforcement learning environments to train large language models. The dataset is text-only and compatible with NeMo Gym. It contains 135,987 (question, answer) tuples and occupies 50.5 MB of storage. The dataset owner is NVIDIA Corporation. The dataset was created on Oct 10, 2025, is licensed under CC-BY 4.0, and is ready for commercial use.
Provider:
nvidia



