the_matrix_dataset_8M_1920_1080
Source: ModelScope community. Updated 2025-12-05; indexed 2025-09-06.
Download link:
https://modelscope.cn/datasets/TheMatrixDataset/the_matrix_dataset_8M_1920_1080
Resource description:
# The Matrix
<div align="center">
<img src="readme_src/white-logo.svg" width="50%" alt="The Matrix logo" />
</div>
<p align="center">
Download The Matrix model weights at
<a href="https://huggingface.co/MatrixTeam/TheMatrix" target="_blank">🤗 Huggingface</a>
or
<a href="https://www.modelscope.cn/models/AiurRuili/TheMatrix" target="_blank">🤖 ModelScope</a>
</p>
<p align="center">
Download The Matrix Dataset at
<a href="https://huggingface.co/MatrixTeam/TheMatrix" target="_blank">🤗 Huggingface</a>
or
<a href="https://www.modelscope.cn/models/AiurRuili/TheMatrix" target="_blank">🤖 ModelScope</a>
</p>
<p align="center">
📚 View the
<a href="https://arxiv.org/abs/2412.03568" target="_blank">Paper</a>,
<a href="https://matrixteam-ai.github.io/pages/TheMatrix/" target="_blank">Website</a>,
and
<a href="http://matrixteam-ai.github.io/docs/TheMatrixDocs" target="_blank">Documentation</a>
</p>
<p align="center">
👋 Say Hi to our team and members at
<a href="https://matrixteam-ai.github.io/" target="_blank">Matrix-Team</a>
</p>
<p align="center">
📍 (Coming Soon) Explore The Matrix playground online at <a href="">Journee</a>.
</p>
<div align="center">
<img src="readme_src/font_30s.png" width="36%" alt="Stylized font banner" />
</div>
# Matrix Dataset
## Dataset Description
**Matrix Dataset** is a large-scale dataset designed for **interactive simulation modeling and vision-based navigation**, covering two representative open-world video game environments: **Forza Horizon 5** and **Cyberpunk 2077**. It includes high-resolution gameplay videos, fine-grained control signals, and 3D positional telemetry, making it suitable for a wide range of tasks including reinforcement learning, world modeling, and video generation.
### Dataset Overview
The dataset consists of two distinct domains:
- **Forza Horizon 5**: 937,900 video clips at 60 FPS, of which 262,707 include corresponding control signals. The combined duration exceeds 2,861.9 hours.
- **Cyberpunk 2077**: Coming soon.
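The Forza Horizon 5 totals can be cross-checked against the per-stage figures listed under Raw Data Sources. The sketch below sums the stage counts and estimates the total duration from the published average clip lengths. The dictionary layout and function name are illustrative and not part of the dataset tooling. Because the averages are rounded, the estimated duration only approximates the stated figure (it lands within about 1% of it), while the clip counts match exactly.

```python
# Per-stage figures copied from "Raw Data Sources":
# (number of clips, average duration in seconds, has control signals)
STAGES = {
    "stage_1": (675_193, 12.8, False),
    "stage_2": (208_933, 6.09, True),
    "stage_3": (53_774, 6.07, True),
}


def summarize(stages):
    """Return total clips, clips with control signals, and estimated hours."""
    total_clips = sum(n for n, _, _ in stages.values())
    labeled_clips = sum(n for n, _, has_ctrl in stages.values() if has_ctrl)
    # Average durations are rounded in the README, so this is an estimate only.
    est_hours = sum(n * secs for n, secs, _ in stages.values()) / 3600
    return total_clips, labeled_clips, est_hours


total, labeled, hours = summarize(STAGES)
print(f"total clips: {total:,}")
print(f"clips with control signals: {labeled:,}")
print(f"estimated duration: {hours:.1f} h")
```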
### Supported Tasks
This dataset can be used for:
- World modeling
- Reinforcement and imitation learning
- Video prediction and generation
- Visual navigation and planning
- Multimodal alignment (e.g., image-control signal modeling)
## Documentation
The [documentation](http://matrixteam-ai.github.io/docs/TheMatrixDocs) covers installation steps, tutorials, and training instructions. The [paper](https://arxiv.org/abs/2412.03568) and [project page](https://matrixteam-ai.github.io/pages/TheMatrix/) give more detail on the method.
## Dataset Generation Details
### Raw Data Sources
- **Forza Horizon 5**:
  - **Stage 1** comprises 675,193 video clips (60 FPS) with an average duration of 12.8 seconds and no associated control signals.
  - **Stage 2** comprises 208,933 clips (60 FPS) with an average duration of 6.09 seconds and one control signal per second. On average, each clip's control-label distribution is D 50.94%, DR 24.67%, DL 24.38%, with 3.04 signal-change events per clip.
  - **Stage 3** comprises 53,774 clips (60 FPS) with an average duration of 6.07 seconds and one control signal per second. On average, each clip's control-label distribution is D 51.25%, DR 24.38%, DL 24.37%, with 3.02 signal-change events per clip.
- **Cyberpunk 2077**: Coming soon.
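The per-clip statistics quoted for Stages 2 and 3 (label shares and signal-change events) could be computed along the following lines. This is a minimal sketch under two assumptions that the README does not confirm: that each clip's control signals are available as a list with one label per second, and that a "signal-change event" means a label differing from the one before it. The function name and the example clip are hypothetical.

```python
from collections import Counter


def clip_label_stats(labels):
    """Return (label shares in percent, number of signal-change events).

    `labels` holds one control label per second of a clip, e.g. ["D", "DR"].
    A change event is counted whenever a label differs from the previous one
    (assumed definition; the README does not spell it out).
    """
    if not labels:
        return {}, 0
    counts = Counter(labels)
    shares = {label: 100.0 * n / len(labels) for label, n in counts.items()}
    changes = sum(1 for prev, cur in zip(labels, labels[1:]) if prev != cur)
    return shares, changes


# Hypothetical 6-second clip, not taken from the dataset
example = ["D", "D", "DR", "D", "DL", "DL"]
shares, changes = clip_label_stats(example)
print(shares, changes)
```

Averaging these per-clip shares and change counts over all clips in a stage would give figures comparable to those reported above.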
### Dataset Annotations
#### Annotation Process
- **Forza Horizon 5** annotations are auto-generated using metadata logs synchronized with video and control signals.
- **Cyberpunk 2077** annotations are inferred from predefined scene types and metadata associated with the manual collection process.
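Since the Forza Horizon 5 clips run at 60 FPS while control signals arrive once per second, frame-level training usually needs one label per frame. The sketch below shows one simple way to do that expansion. It assumes that the i-th signal covers frames `60*i` through `60*i + 59` and that trailing frames reuse the last label; the actual alignment used in the metadata logs is not specified in this README, and `labels_per_frame` is a hypothetical helper.

```python
FPS = 60  # frame rate stated in the README


def labels_per_frame(second_labels, num_frames, fps=FPS):
    """Assign each frame the label of the second it falls in.

    Frames beyond the last labeled second reuse the final label
    (an assumption; the dataset may treat partial seconds differently).
    """
    if not second_labels:
        raise ValueError("need at least one per-second label")
    last = len(second_labels) - 1
    return [second_labels[min(i // fps, last)] for i in range(num_frames)]


# Hypothetical clip: three labeled seconds, 185 frames
frames = labels_per_frame(["D", "DR", "D"], num_frames=185)
print(frames[59], frames[60], frames[184])
```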
#### Annotators
- **Forza Horizon 5** data collection and annotation are fully automated.
- **Cyberpunk 2077** data is collected by trained human annotators through remote access to cloud-hosted servers.
## Citation
If you find our work useful, please consider citing:
```bibtex
@article{feng2024matrix,
title={The matrix: Infinite-horizon world generation with real-time moving control},
author={Feng, Ruili and Zhang, Han and Yang, Zhantao and Xiao, Jie and Shu, Zhilei and Liu, Zhiheng and Zheng, Andy and Huang, Yukun and Liu, Yu and Zhang, Hongyang},
journal={arXiv preprint arXiv:2412.03568},
year={2024}
}
```
## License
The Matrix Dataset is released under the Apache 2.0 License.
Provider: maas
Created: 2025-02-13