
ngqtrung/raw_set_3372

Source: Hugging Face. Updated 2025-12-08; indexed 2025-12-20.
Download link:
https://hf-mirror.com/datasets/ngqtrung/raw_set_3372
Resource description:
# Raw Video Dataset (3372 Videos)

This dataset contains 3372 videos with both single-modality and cross-modality question-answering annotations.

## Dataset Structure

```
train/
├── metadata.json          # Combined metadata with all annotations
├── videos_part_001.tar    # Video files (part 1)
├── videos_part_002.tar    # Video files (part 2)
└── ...                    # Additional parts (~5GB each)
```

## Metadata Format

The `metadata.json` file contains:

```json
{
  "video_id": {
    "video_path": "video_id.mp4",
    "single_modality": {
      "vision_only": {
        "question": "...",
        "choices": {},
        "correct_answer": "..."
      },
      "vision_only_misleading": { ... },
      "audio_only": { ... },
      "audio_only_misleading": { ... }
    },
    "cross_modality": {
      "task0": { "variant_type": "default", "question": "...", ... },
      "task1": { "variant_type": "audio_misleading", ... },
      "task2": { "variant_type": "visual_misleading", ... }
    }
  }
}
```

## Question Types

### Single Modality

- **vision_only**: Questions about visual content only
- **vision_only_misleading**: Vision questions with misleading visual information
- **audio_only**: Questions about audio content only
- **audio_only_misleading**: Audio questions with misleading audio information

### Cross Modality

- **default**: Questions requiring both audio and visual understanding
- **audio_misleading**: Cross-modal questions with misleading audio
- **visual_misleading**: Cross-modal questions with misleading visuals

## Options

- Questions include options A, B, C, D
- Option E: "Vision details are wrong" (for vision questions) or "Audio details are wrong" (for audio questions)
- Option F: "Audio details are wrong" (only for cross-modality questions)

## Usage

### Extract Videos

```bash
# Create the target directory first; tar -C fails if it does not exist
mkdir -p videos

# Extract all tar files
for tar_file in train/videos_part_*.tar; do
    tar -xf "$tar_file" -C videos/
done
```

### Load Metadata

```python
import json

with open('train/metadata.json', 'r') as f:
    metadata = json.load(f)

# Access data for a video
video_id = "example_video_id"
video_data = metadata[video_id]
print(video_data['single_modality']['vision_only']['question'])
```

## Statistics

- **Total Videos**: 3372
- **Total Tar Files**: 18
- **Single Modality Questions**: 13488
- **Cross Modality Questions**: 10116

## Citation

If you use this dataset, please cite appropriately.

## License

Please check the original video sources for licensing information.
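
## Example: Counting Questions

The figures under Statistics are consistent with four single-modality and three cross-modality questions per video (3372 × 4 = 13488 and 3372 × 3 = 10116). The following is a minimal sketch of how one might tally questions from `metadata.json` to check this, assuming the file follows the layout shown under Metadata Format. The helper names `load_metadata` and `count_questions` and the `sample` entry are hypothetical, introduced here for illustration only; they are not part of the dataset.

```python
import json
from collections import Counter


def load_metadata(path):
    """Load the metadata mapping (video_id -> annotations) from a JSON file."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


def count_questions(metadata):
    """Tally single- and cross-modality questions in a metadata mapping."""
    counts = Counter()
    for entry in metadata.values():
        # Each key under "single_modality" (vision_only, audio_only, ...)
        # is assumed to hold exactly one question.
        counts["single_modality"] += len(entry.get("single_modality", {}))
        # Each key under "cross_modality" (task0, task1, ...) is assumed to
        # hold one question; its variant_type is tallied as well.
        for task in entry.get("cross_modality", {}).values():
            counts["cross_modality"] += 1
            counts["cross:" + task.get("variant_type", "unknown")] += 1
    return counts


# Hypothetical entry in the documented shape (not real dataset content).
_question = {"question": "placeholder", "choices": {}, "correct_answer": "A"}
sample = {
    "demo_video": {
        "video_path": "demo_video.mp4",
        "single_modality": {
            "vision_only": dict(_question),
            "vision_only_misleading": dict(_question),
            "audio_only": dict(_question),
            "audio_only_misleading": dict(_question),
        },
        "cross_modality": {
            "task0": {"variant_type": "default", **_question},
            "task1": {"variant_type": "audio_misleading", **_question},
            "task2": {"variant_type": "visual_misleading", **_question},
        },
    }
}

if __name__ == "__main__":
    print(count_questions(sample))
```

To run the same tally on the real file, call `count_questions(load_metadata('train/metadata.json'))` and compare the `single_modality` and `cross_modality` totals with the Statistics section. Using `.get` with defaults keeps the sketch tolerant of entries that lack one of the two sections, which the source does not rule out.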
Provider:
ngqtrung