
NitroGen

ModelScope Community | Updated 2026-01-08 | Indexed 2025-12-27
Download link:
https://modelscope.cn/datasets/nv-community/NitroGen
Resource description:
<img src="https://cdn-uploads.huggingface.co/production/uploads/67d8509cb6b70254852d734d/u3VY6_KoT6tEs86YPehU2.gif" style="width:100%; height:auto;" />

<div align="center">
  <p style="font-size: 1.2em;">
    <a href="https://nitrogen.minedojo.org/"><strong>Website</strong></a> |
    <a href="https://huggingface.co/nvidia/NitroGen"><strong>Model</strong></a> |
    <a href="https://huggingface.co/datasets/nvidia/NitroGen"><strong>Dataset</strong></a> |
    <a href="https://arxiv.org/abs/2601.02427"><strong>Paper</strong></a>
  </p>
</div>

# NitroGen Dataset

## Dataset Description:

The NitroGen dataset contains action annotations for publicly available gameplay videos. Specifically, we used an in-house model to annotate each video frame with gamepad actions. Note that reproducing results from the NitroGen paper requires additional filtering, such as IDLE frame filtering.

This repository is structured as follows:

```bash
├── actions
│   ├── SHARD_0000
│   │   ├── <video_id>
│   │   │   ├── <video_id>_chunk_0000
│   │   │   │   ├── actions_processed.parquet
│   │   │   │   ├── actions_raw.parquet
│   │   │   │   └── metadata.json
│   │   │   ├── <video_id>_chunk_0001
│   │   │   │   ├── actions_processed.parquet
│   │   │   │   ├── actions_raw.parquet
│   │   │   │   └── metadata.json
│   │   │   ├── ...
│   ├── SHARD_0001
│   │   ├── ...
│   ├── ...
```

Annotations for each video are split into 20-second chunks.
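
The layout above can be walked with the Python standard library alone. The following is a minimal sketch, not part of any official tooling: the function name `iter_chunk_dirs` and the local `root` path are illustrative assumptions, and the code relies only on the directory and file names shown in the tree.

```python
from pathlib import Path
from typing import Iterator


def iter_chunk_dirs(root: Path) -> Iterator[Path]:
    """Yield chunk directories under <root>/actions in sorted order.

    Assumes the layout shown in the tree:
    actions/SHARD_XXXX/<video_id>/<video_id>_chunk_XXXX/
    `root` is a hypothetical local copy of the repository.
    """
    actions = root / "actions"
    for shard in sorted(p for p in actions.glob("SHARD_*") if p.is_dir()):
        for video in sorted(p for p in shard.iterdir() if p.is_dir()):
            for chunk in sorted(p for p in video.iterdir() if p.is_dir()):
                # Keep only directories that look like chunks and that
                # actually carry a metadata.json file.
                if "_chunk_" in chunk.name and (chunk / "metadata.json").is_file():
                    yield chunk
```

Sorting at each level keeps the chunks of a video in chunk-number order, since the chunk suffix is zero-padded in the tree.
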
Each chunk directory contains the following files:

- `actions_raw.parquet`: a table that stores per-frame gamepad actions
- `metadata.json`: contains all metadata related to the chunk, such as timestamps, length, or URL
- `actions_processed.parquet` (optional): same format as `actions_raw.parquet`, but with quality filtering and remapping applied

`metadata.json` contains the following:

```bash
{
    "uuid": "<video_id>_chunk_<chunk_number>_actions",
    "chunk_id": "<chunk_number>",
    "chunk_size": int,  # number of frames in the chunk
    "original_video": {
        "resolution": [1080, 1920],
        "video_id": "<video_id>",
        "source": str,
        "url": str,
        # chunk start and end timestamps
        "start_time": float,  # in seconds
        "end_time": float,
        "duration": float,
        "start_frame": int,
        "end_frame": int,
    },
    "game": str,
    "controller_type": str,
    # bbox to mask the on-screen controller in pixel space, relative to resolution above
    "bbox_controller_overlay": [xtl, ytl, w, h],
    # optional, only if the gameplay is not full screen in the video, relative coordinates in [0, 1]
    "bbox_game_area": {
        "xtl": float,
        "ytl": float,
        "xbr": float,
        "ybr": float
    },
    # optional, list of bounding boxes for elements that are not gameplay
    "bbox_others": [
        {
            "xtl": float,
            "ytl": float,
            "xbr": float,
            "ybr": float
        },
        ...
    ]
}
```

`actions_raw.parquet` and `actions_processed.parquet` are tables containing gamepad actions; each row corresponds to the gamepad state for one frame of the original video. Each row follows a standard gamepad layout, with $17$ boolean columns for buttons and $2$ columns for each joystick, containing pairs of $[-1,1]$ values.

Button columns are the following:

```python
[
    "dpad_down",
    "dpad_left",
    "dpad_right",
    "dpad_up",
    "left_shoulder",
    "left_thumb",
    "left_trigger",
    "right_shoulder",
    "right_thumb",
    "right_trigger",
    "south",
    "west",
    "east",
    "north",
    "back",
    "start",
    "guide",
]
```

Joystick columns are `j_left` and `j_right`. They contain $x,y$ coordinates in $[-1, 1]$.
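
The description above mentions that reproducing the paper's results requires extra filtering, such as IDLE frame filtering, but this card does not define that filter. The sketch below shows one plausible reading, using the column names listed above. It is an illustration under stated assumptions, not the authors' method: the definition of "idle" (no button pressed, both sticks near centre), the `deadzone` threshold, the helper names, and the assumption that each joystick cell holds an `(x, y)` pair are all hypothetical.

```python
from typing import Any, List, Mapping, Sequence

# Button column names as listed in the dataset card.
BUTTON_COLUMNS = [
    "dpad_down", "dpad_left", "dpad_right", "dpad_up",
    "left_shoulder", "left_thumb", "left_trigger",
    "right_shoulder", "right_thumb", "right_trigger",
    "south", "west", "east", "north", "back", "start", "guide",
]
JOYSTICK_COLUMNS = ["j_left", "j_right"]


def is_idle(row: Mapping[str, Any], deadzone: float = 0.1) -> bool:
    """Return True when no button is pressed and both sticks are near centre.

    ASSUMPTIONS (not from the dataset card): this definition of "idle",
    the `deadzone` value, and that each joystick cell is an (x, y) pair.
    """
    if any(bool(row[name]) for name in BUTTON_COLUMNS):
        return False
    for name in JOYSTICK_COLUMNS:
        x, y = row[name]
        if abs(x) > deadzone or abs(y) > deadzone:
            return False
    return True


def drop_idle(rows: Sequence[Mapping[str, Any]], deadzone: float = 0.1) -> List[Mapping[str, Any]]:
    """Keep only the rows that are not idle under the assumed definition."""
    return [row for row in rows if not is_idle(row, deadzone)]
```

The functions take plain mappings so that they do not depend on a particular parquet reader. With pandas and a parquet engine installed, rows of this shape could likely be obtained with `pandas.read_parquet(...)` followed by `.to_dict("records")`; whether the joystick cells arrive as lists or arrays depends on how the files were written.
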
Note that $(-1,-1)$ is the **top-left**, as is standard for joystick axes.

This dataset only includes the gamepad action labels.

This dataset is for research and development only.

## Dataset Owner(s):
NVIDIA Corporation

## Dataset Creation Date:
2025-12-19

## License/Terms of Use:
CC BY-NC 4.0

## Intended Usage:
This dataset is intended for training behavior cloning policies (video to actions) and world models (actions to video).

## Dataset Characterization
**Data Collection Method**<br>
Automated <br>

**Labeling Method**<br>
Synthetic <br>

## Dataset Format
Tabular, parquet files

## Dataset Quantification
Annotated videos: 30k

Total number of frames annotated: ~15B

## Ethical Considerations:
NVIDIA believes Trustworthy AI is a shared responsibility and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their internal model team to ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse.

Please report model quality, risk, security vulnerabilities or NVIDIA AI Concerns [here](https://www.nvidia.com/en-us/support/submit-security-vulnerability/).

Provider:
maas
Created:
2025-12-20