Open-Sora-Plan-v1.2.0
Source: ModelScope community. Updated 2025-12-05; indexed 2025-06-07.
Download link:
https://modelscope.cn/datasets/PKU-YuanLab/Open-Sora-Plan-v1.2.0
Description:
# 10M SAM
The original JSON files were taken from [v1.1.0](https://huggingface.co/datasets/LanguageBind/Open-Sora-Plan-v1.1.0/tree/main/anno_jsons); the only change is the added `resolution` field.
The image annotation file has the following format.
```
[
  {
    "path": "00168/001680102.jpg",
    "cap": [
      "xxxxx."
    ],
    "resolution": {
      "height": 512,
      "width": 683
    }
  },
  ...
]
```
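As an illustration of how the `resolution` field might be used, here is a minimal Python sketch that keeps only the entries meeting a minimum image size. The helper name `filter_by_min_resolution` and the thresholds are hypothetical and not part of the dataset tooling; the sample entry is the one shown above.

```python
import json


def filter_by_min_resolution(items, min_height, min_width):
    """Keep annotation entries whose image is at least min_height x min_width.

    Hypothetical helper for illustration; not part of the dataset tooling.
    """
    return [
        item
        for item in items
        if item["resolution"]["height"] >= min_height
        and item["resolution"]["width"] >= min_width
    ]


# Sample entry copied from the format above. A real annotation file would be
# read with json.load() on an open file handle instead.
sample = json.loads(
    """
    [
      {
        "path": "00168/001680102.jpg",
        "cap": ["xxxxx."],
        "resolution": {"height": 512, "width": 683}
      }
    ]
    """
)

kept = filter_by_min_resolution(sample, min_height=512, min_width=512)
print([item["path"] for item in kept])
```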
# 6M HQ Panda70m
The video annotation file has the following format. Each element's `path` follows the structure `part_x/youtube_id/youtube_id_segment_i.mp4`.
Here, `part_x` is our own organizational folder, which you can customize to match your download path.
The `youtube_id` and `segment_i` values can be obtained from the [original annotation file](https://github.com/snap-research/Panda-70M/tree/main/dataset_dataloading).
```
[
  {
    "path": "panda70m_part_5565/qLqjjDhhD5Q/qLqjjDhhD5Q_segment_0.mp4",
    "cap": [
      "A man and a woman are sitting down on a news anchor talking to each other."
    ],
    "resolution": {
      "height": 720,
      "width": 1280
    },
    "fps": 29.97002997002997,
    "duration": 11.444767
  },
  ...
]
```
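The path structure described above can be split back into its components, and `fps` and `duration` give a rough frame count. The following is a sketch only: `parse_video_path` and `approx_frame_count` are hypothetical helper names, and it assumes `duration` is measured in seconds, which the source does not state.

```python
import re
from pathlib import PurePosixPath


def parse_video_path(path):
    """Split 'part_x/youtube_id/youtube_id_segment_i.mp4' into its components.

    Hypothetical helper; returns (part, youtube_id, segment_index).
    """
    part, youtube_id, filename = PurePosixPath(path).parts
    # The file name repeats the YouTube ID, followed by the segment index.
    match = re.fullmatch(re.escape(youtube_id) + r"_segment_(\d+)\.mp4", filename)
    if match is None:
        raise ValueError(f"unexpected file name: {filename}")
    return part, youtube_id, int(match.group(1))


def approx_frame_count(entry):
    """Estimate the frame count, assuming 'duration' is in seconds."""
    return round(entry["fps"] * entry["duration"])


# Entry copied from the format above.
entry = {
    "path": "panda70m_part_5565/qLqjjDhhD5Q/qLqjjDhhD5Q_segment_0.mp4",
    "fps": 29.97002997002997,
    "duration": 11.444767,
}

part, youtube_id, segment_index = parse_video_path(entry["path"])
print(part, youtube_id, segment_index, approx_frame_count(entry))
```

Because `part_x` depends on your own download layout, only `youtube_id` and the segment index are suitable for matching entries against the original Panda-70M annotations.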
# 100k HQ data
The original data was taken from [v1.1.0](https://huggingface.co/datasets/LanguageBind/Open-Sora-Plan-v1.1.0/tree/main). We reorganized the captions.
Provider: maas
Created: 2025-06-05



