Context-as-Memory-Dataset

Name: Context-as-Memory-Dataset
Creator: maas
Published: 2025-11-27 16:52:12
License: No description available

ModelScope community · Updated 2025-11-27 · Indexed 2025-11-03

Download link:

https://modelscope.cn/datasets/KwaiVGI/Context-as-Memory-Dataset


Resource description:

<div align="center">
<h1>Context as Memory: Scene-Consistent Interactive Long Video Generation with Memory Retrieval</h1>
<h1>SIGGRAPH Asia 2025</h1>
<p>
<a href="https://context-as-memory.github.io/">[Project page]</a>
<a href="https://arxiv.org/pdf/2506.03141">[ArXiv]</a>
<a href="https://huggingface.co/datasets/KwaiVGI/Context-as-Memory-Dataset">[Dataset]</a>
</p>
</div>

# File Structure

To prepare the dataset for use, merge the parts into a single zip file using the following command:

```bash
cat Context-as-Memory-Dataset_* > Context-as-Memory-Dataset.zip
```

After extracting `Context-as-Memory-Dataset.zip`, the dataset will be organized as follows:

```
Context-as-Memory-Dataset
├── frames
│   ├── AncientTempleEnv_0
│   │   ├── 0000.png
│   │   ├── 0001.png
│   │   ├── 0002.png
│   │   └── ...
│   ├── AncientTempleEnv_1
│   │   ├── 0000.png
│   │   ├── 0001.png
│   │   ├── 0002.png
│   │   └── ...
│   └── ...
│
├── jsons
│   ├── AncientTempleEnv_0.json
│   ├── AncientTempleEnv_1.json
│   └── ...
│
├── overlap_labels
│   ├── AncientTempleEnv_0
│   │   ├── 0.json
│   │   ├── 1.json
│   │   ├── 2.json
│   │   └── ...
│   ├── AncientTempleEnv_1
│   │   ├── 0.json
│   │   ├── 1.json
│   │   ├── 2.json
│   │   └── ...
│   └── ...
│
└── captions.txt
```

# Explanation of Dataset Parts

- **`frames/`**: 100 subdirectories, each containing 7,601 video frame images.
- **`jsons/`**: 100 JSON files, each storing the camera pose (position + rotation) of every frame in the corresponding long video.
- **`overlap_labels/`**: 100 subdirectories, each containing 7,601 JSON files, where each file records the indices of overlapping frames corresponding to that frame.
- **`captions.txt`**: Captions annotated for a segment of a long video, from a given starting frame to an ending frame.
- We also provide a simple code file, `tools.py`, which can convert (x, y, z, yaw, pitch) into RT, and can also select a specific frame as the reference frame to align the RT of other frames to its coordinate system.
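The layout and the pose conversion described above can be sketched in a few lines of Python. This is an illustration only, not the contents of `tools.py`, which is not reproduced here. Every function name below (`frame_path`, `overlap_label_path`, `pose_to_rt`, `invert_rt`, `align_to_reference`) is hypothetical. The sketch also assumes that yaw and pitch are given in degrees, that yaw rotates about the z axis and pitch about the y axis, and that the resulting 4x4 RT matrix is camera-to-world. The dataset may use different conventions, so check them against `tools.py` before relying on the numbers.

```python
import math
from pathlib import PurePosixPath


def frame_path(root, scene, index):
    # Frame images are zero-padded to four digits in the tree above (0000.png).
    return str(PurePosixPath(root) / "frames" / scene / f"{index:04d}.png")


def overlap_label_path(root, scene, index):
    # Overlap label files are not zero-padded in the tree above (0.json).
    return str(PurePosixPath(root) / "overlap_labels" / scene / f"{index}.json")


def matmul(a, b):
    # Plain nested-list matrix product, so the sketch needs no third-party packages.
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def pose_to_rt(x, y, z, yaw_deg, pitch_deg):
    # ASSUMPTION: degrees, yaw about z, pitch about y, camera-to-world output.
    yaw, pitch = math.radians(yaw_deg), math.radians(pitch_deg)
    cy, sy = math.cos(yaw), math.sin(yaw)
    cp, sp = math.cos(pitch), math.sin(pitch)
    rot_yaw = [[cy, -sy, 0.0], [sy, cy, 0.0], [0.0, 0.0, 1.0]]
    rot_pitch = [[cp, 0.0, sp], [0.0, 1.0, 0.0], [-sp, 0.0, cp]]
    r = matmul(rot_yaw, rot_pitch)
    return [r[0] + [x], r[1] + [y], r[2] + [z], [0.0, 0.0, 0.0, 1.0]]


def invert_rt(rt):
    # Inverse of a rigid transform: rotation R^T, translation -R^T t.
    r_t = [[rt[j][i] for j in range(3)] for i in range(3)]
    t = [rt[i][3] for i in range(3)]
    new_t = [-sum(r_t[i][k] * t[k] for k in range(3)) for i in range(3)]
    return [r_t[i] + [new_t[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]


def align_to_reference(rt, ref_rt):
    # Express a frame's pose in the coordinate system of the reference frame.
    return matmul(invert_rt(ref_rt), rt)


if __name__ == "__main__":
    ref = pose_to_rt(0.0, 0.0, 0.0, 90.0, 0.0)
    other = pose_to_rt(0.0, 2.0, 0.0, 90.0, 0.0)
    for row in align_to_reference(other, ref):
        print([round(v, 6) for v in row])
```

Under these assumed conventions, a frame aligned to itself yields the identity matrix, and the example in the `__main__` block yields an identity rotation with a relative translation of (2, 0, 0), because the world-space offset along y is re-expressed in the rotated reference frame. The key names inside the pose JSON files are not documented above, so the sketch takes the pose values as plain arguments rather than guessing how to parse them.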


Provider:

maas

Created:

2025-10-09

Collection summary

Dataset introduction