Cosmos-Reason1-Benchmark

Name: Cosmos-Reason1-Benchmark
Creator: maas
Published: 2025-12-04 09:19:27
License: No description provided

ModelScope community. Updated 2025-12-04; indexed 2025-05-24.

Download link:

https://modelscope.cn/datasets/nv-community/Cosmos-Reason1-Benchmark

Resource description:

## Dataset Description:

The data format is a pair of video and text annotations. We summarize the data and annotations in Table 4 (SFT), Table 5 (RL), and Table 6 (Benchmark) of the Cosmos-Reason1 paper. We release the annotations for embodied reasoning tasks for BridgeDataV2, RoboVQA, Agibot, HoloAssist, and AV, as well as the videos for the RoboVQA and AV datasets. We additionally release the annotations and videos for the RoboFail dataset for benchmarks. By releasing the dataset, NVIDIA supports the development of open embodied reasoning models and provides benchmarks to evaluate progress.

This dataset is ready for commercial/non-commercial use.

## Dataset Owner(s):

NVIDIA Corporation

## Dataset Creation Date:

2025/05/17

## License/Terms of Use:

The use of this dataset is governed by [CC-BY-4.0](https://creativecommons.org/licenses/by/4.0/deed.en). Additional Information: [Apache License 2.0](https://github.com/google-deepmind/robovqa/blob/main/LICENSE); [MIT](https://github.com/real-stanford/reflect/blob/main/LICENSE).

## Intended Usage:

This dataset is intended to demonstrate and facilitate understanding and usage of the Cosmos-Reason1 models. It should primarily be used for educational and demonstration purposes.

## Dataset Characterization

The embodied reasoning datasets and benchmarks focus on the following areas: robotics (RoboVQA, BridgeDataV2, Agibot, RoboFail), ego-centric human demonstration (HoloAssist), and Autonomous Vehicle (AV) driving video data.
**The AV data is currently unavailable and will be uploaded soon!**

**Data Collection Method**:
* RoboVQA: Hybrid: Automatic/Sensors
* BridgeDataV2: Automatic/Sensors
* AgiBot: Automatic/Sensors
* RoboFail: Automatic/Sensors
* HoloAssist: Human
* AV: Automatic/Sensors

**Labeling Method**:
* RoboVQA: Hybrid: Human, Automated
* BridgeDataV2: Hybrid: Human, Automated
* AgiBot: Hybrid: Human, Automated
* RoboFail: Hybrid: Human, Automated
* HoloAssist: Hybrid: Human, Automated
* AV: Hybrid: Human, Automated

## Dataset Format

* Modality: Video (mp4) and Text

## Dataset Quantification

We release the embodied reasoning data and benchmarks. Each data sample is a pair of video and text. The text annotations include understanding and reasoning annotations described in the Cosmos-Reason1 paper. Each video may have multiple text annotations. The quantity of the video and text pairs is described in the table below.

| Dataset | SFT Data | RL Data | Benchmark Data |
|--------------|---------:|--------:|---------------:|
| [RoboVQA](https://robovqa.github.io/) | 1.14M | 252 | 110 |
| AV | 24.7k | 200 | 100 |
| [BridgeDataV2](https://rail-berkeley.github.io/bridgedata/) | 258k | 240 | 100 |
| [Agibot](https://github.com/OpenDriveLab/AgiBot-World) | 38.9k | 200 | 100 |
| [HoloAssist](https://holoassist.github.io/) | 273k | 200 | 100 |
| [RoboFail](https://robot-reflect.github.io/) | N/A | N/A | 100 |
| **Total Storage Size** | **300.6GB** | **2.6GB** | **1.5GB** |

We release text annotations for all embodied reasoning datasets and videos for RoboVQA and AV datasets. For other datasets, users may download the source videos from the original data source and find corresponding video sources via the video names. The held-out RoboFail benchmark is released for measuring the generalization capability.
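For the datasets whose videos must be fetched from the original source, matching annotations to local files by video name might look like the sketch below. This card does not describe the annotation schema, so the `video` field name, the list-of-dicts annotation layout, and the helper names `build_video_index` and `match_annotations` are all assumptions made for illustration; adjust them to the released annotation files.

```python
from pathlib import Path


def build_video_index(video_root):
    """Map each .mp4 file name found under video_root to its full path."""
    index = {}
    for path in sorted(Path(video_root).rglob("*.mp4")):
        # Keep the first match so duplicate file names do not silently overwrite.
        index.setdefault(path.name, path)
    return index


def match_annotations(annotations, index, name_key="video"):
    """Split annotations into (matched, missing) by looking up the video name.

    `name_key` is a hypothetical field name; the real annotation files may
    store the video name under a different key.
    """
    matched, missing = [], []
    for ann in annotations:
        # Compare on the bare file name in case the annotation stores a relative path.
        name = Path(ann[name_key]).name
        if name in index:
            matched.append((ann, index[name]))
        else:
            missing.append(ann)
    return matched, missing
```

Returning the unmatched annotations separately makes it easy to see which source videos still need to be downloaded before running an evaluation.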
## Reference(s):

[[2503.15558] Cosmos-Reason1: From Physical Common Sense To Embodied Reasoning](https://arxiv.org/abs/2503.15558)

## Ethical Considerations:

NVIDIA believes Trustworthy AI is a shared responsibility and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their internal model team to ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse. Please report security vulnerabilities or NVIDIA AI Concerns [here](https://www.nvidia.com/en-us/support/submit-security-vulnerability/).


Provider:

maas

Created:

2025-05-20
