PPTAnimation_Test
PPTAnimation_Test Dataset Overview
1. Basic Information
- License: Apache License 2.0
- Tags: PPTAnimation, PowerPoint, video-caption, slides-animation, Vision-Language, synthetic
2. Data Modality and Task
- Multimodal task: Video-Captioning
- Resolution: 1280 x 720 (720p)
- Encoding format: MP4
- Language: English (monolingual)
3. Dataset Contents
- Number of videos: 1,000 synthetic short videos (each under 15 seconds)
- Video format: MP4 files (Videos/video_0001.mp4 ... video_1000.mp4)
- Caption format: plain-text files (Captions/video_0001.txt ... video_1000.txt)
4. Directory Structure
```bash
PPTAnimation_Test/
├── Videos/
│   ├── video_0001.mp4
│   ├── video_0002.mp4
│   └── ...
└── Captions/
    ├── video_0001.txt
    ├── video_0002.txt
    └── ...
```
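Because each video and its caption share a file stem (video_0001.mp4 and video_0001.txt), the two folders can be joined by name. The following is a minimal sketch of that pairing for a local copy of the dataset; the function name `pair_videos_with_captions` and the UTF-8 encoding of the caption files are assumptions for illustration and are not part of the dataset release.

```python
from pathlib import Path


def pair_videos_with_captions(root):
    """Return (pairs, missing) for a PPTAnimation_Test-style directory.

    pairs   -- list of (video_path, caption_text), sorted by file name
    missing -- stems of videos that have no matching caption file
    """
    root = Path(root)
    pairs, missing = [], []
    for video in sorted((root / "Videos").glob("video_*.mp4")):
        # The caption is assumed to share the video's stem, e.g. video_0001.txt.
        caption = root / "Captions" / (video.stem + ".txt")
        if caption.is_file():
            # Assumption: captions are UTF-8 plain text.
            pairs.append((video, caption.read_text(encoding="utf-8").strip()))
        else:
            missing.append(video.stem)
    return pairs, missing
```

Calling `pair_videos_with_captions("PPTAnimation_Test")` on a complete download should yield 1,000 pairs and an empty `missing` list, which doubles as a quick integrity check.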
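5. Tasks and Applications
- Vision-language alignment: video-text retrieval, cross-modal understanding
- Video captioning: generating text descriptions from animation videos
- VLM fine-tuning/benchmarking: evaluating a model's ability to understand PPT animations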
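For the video-text retrieval use case listed above, a common way to score a model is Recall@K over a caption-to-video similarity matrix. The sketch below is illustrative only: the function `recall_at_k` is a hypothetical helper, the similarity scores are assumed to come from whatever model is being evaluated, and this is not the evaluation protocol of the dataset authors.

```python
def recall_at_k(similarity, k=1):
    """Fraction of captions whose matching video ranks in the top k.

    similarity[i][j] is the score between caption i and video j.
    Assumption: the correct video for caption i is video i.
    """
    if not similarity:
        return 0.0
    hits = 0
    for i, row in enumerate(similarity):
        # Indices of the k highest-scoring videos for caption i.
        top_k = sorted(range(len(row)), key=lambda j: row[j], reverse=True)[:k]
        if i in top_k:
            hits += 1
    return hits / len(similarity)
```

With 1,000 video-caption pairs the matrix would be 1,000 x 1,000; reporting several values of `k` (for example 1, 5, and 10) gives a fuller picture than a single cutoff.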
6. Citation
```bibtex
@misc{jiang2025animationneedsattentionholistic,
  title         = {Animation Needs Attention: A Holistic Approach to Slides Animation Comprehension with Visual-Language Models},
  author        = {Yifan Jiang and Yibo Xue and Yukun Kang and Pin Zheng and Jian Peng and Feiran Wu and Changliang Xu},
  year          = {2025},
  eprint        = {2507.03916},
  archivePrefix = {arXiv},
  primaryClass  = {cs.AI},
  url           = {https://arxiv.org/abs/2507.03916},
}
```
7. Usage Example
```python
from modelscope.msdatasets import MsDataset

dataset = MsDataset.load(
    dataset_name='jyf9774/PPTAnimation_Test',
    namespace='jyf9774',
    split='train'  # no official split; use 'train' or None
)

sample = dataset[0]
print(sample['text'])      # Caption text
sample['video'].display()  # Preview the video in a notebook
```




