nvidia/Nemotron-RL-knowledge-openqa

Source: Hugging Face. Updated 2025-12-12; indexed 2025-12-20.
Download link:
https://hf-mirror.com/datasets/nvidia/Nemotron-RL-knowledge-openqa
Overview:
Nemotron-RL-knowledge-openQA is a multi-domain synthetic dataset of knowledge-based questions. It is built from unstructured sources such as books and articles and consists of question-answer pairs that require short responses. The domains covered include physics, biology, mathematics, computer science, engineering, chemistry, and law, among others. The dataset is released as part of NVIDIA NeMo Gym, a framework for building reinforcement learning environments to train large language models. It is text-only, compatible with NeMo Gym, and contains 135,987 (question, answer) tuples with a total storage size of 50.5 MB. The dataset is owned by NVIDIA Corporation, was created on Oct 10, 2025, and is licensed under CC-BY 4.0; it is ready for commercial use.
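The listing does not state the split names or record fields, so a first step is usually to download the dataset and inspect one row. The following is a minimal sketch, assuming the repository loads with the default settings of the Hugging Face `datasets` library; `load_openqa` and `describe_record` are hypothetical helper names introduced here for illustration, not part of NeMo Gym.

```python
from typing import Any, Dict, Mapping

# Repository name as given in the listing.
DATASET_ID = "nvidia/Nemotron-RL-knowledge-openqa"


def describe_record(record: Mapping[str, Any]) -> Dict[str, str]:
    """Map each top-level field name to the name of its Python type.

    Schema-agnostic on purpose: the listing does not document the fields.
    """
    return {key: type(value).__name__ for key, value in record.items()}


def load_openqa():
    """Download the dataset with the Hugging Face `datasets` library.

    Assumption: the repository loads without an explicit configuration
    name. Split names are not stated in the listing, so inspect the result.
    """
    # Imported lazily so describe_record works without the dependency
    # (install with `pip install datasets`).
    from datasets import load_dataset

    return load_dataset(DATASET_ID)
```

Printing the value returned by `load_openqa()` lists the available splits and row counts, and passing the first row of any split to `describe_record` shows which fields hold the question and the answer.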
Provider:
nvidia