
full-modality-data

ModelScope community. Updated: 2025-12-03. Indexed: 2025-08-09.
Download link:
https://modelscope.cn/datasets/lmms-lab/full-modality-data
Resource description:
# Full Modality Dataset Statistics

## Video Statistics

- **Total Videos**: 28,472
- **Total Duration**: 1422.33 hours
- **Average Duration**: 179.84 seconds
- **Median Duration**: 160.08 seconds
- **Duration Range**: 10.04s - 1780.03s

## QA Statistics

- **Total Questions**: 1,444,526
- **Average Questions per Video**: 50.7
- **Questions per Video Range**: 14 - 450

## Question Type Distribution

- **OE** (open-ended): 1,444,526 (100.0%)

## Question Category Distribution

- **temporal**: 96,873 (6.7%)
- **causal**: 96,873 (6.7%)
- **description_scene**: 96,873 (6.7%)
- **description_human**: 96,873 (6.7%)
- **description_object**: 96,873 (6.7%)
- **binary**: 96,873 (6.7%)
- **fine_grained_action_understanding**: 96,873 (6.7%)
- **plot_understanding**: 96,873 (6.7%)
- **non_existent_actions**: 96,873 (6.7%)
- **time_order_understanding**: 96,873 (6.7%)
- **attribute_change**: 96,873 (6.7%)
- **audio_visual_dialogue_consistency**: 96,873 (6.7%)
- **audio_visual_subtext**: 96,873 (6.7%)
- **audio_visual_mood**: 96,873 (6.7%)
- **spatial_reasoning**: 88,304 (6.1%)

## Dataset Description

This dataset contains multimodal video question-answering pairs that require both visual and audio information to answer correctly. The questions span multiple categories, including temporal reasoning, causal analysis, and scene description. All questions use an open-ended format.

## Dataset Structure

The dataset contains the following columns:

- `video_id`: Unique identifier for the video
- `video_filename`: Original filename of the video
- `video_duration`: Duration of the video in seconds
- `video_size_mb`: Size of the video file in MB
- `segment`: Time segment within the video (format: start_time-end_time)
- `category`: Question category (e.g., temporal, causal, description_scene, etc.)
- `question`: The question text (open-ended format)
- `answer`: The correct answer

## Usage

```python
from datasets import load_dataset

# Assumption: the rows live in a "train" split. Without `split`,
# load_dataset returns a DatasetDict, which has no 'category' column.
dataset = load_dataset("ngqtrung/full-modality-data", split="train")

# Filter by category
temporal_questions = dataset.filter(lambda x: x['category'] == 'temporal')
causal_questions = dataset.filter(lambda x: x['category'] == 'causal')

# Get unique categories
categories = set(dataset['category'])
print(f"Available categories: {categories}")
```
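
Once rows are loaded, the `segment` and `category` columns can be processed with plain Python. The following is a minimal sketch, not part of the dataset's own tooling. It assumes that `segment` holds two non-negative numbers in seconds separated by a single hyphen (the card only states the format as start_time-end_time). The helper names `parse_segment` and `category_shares` and the sample rows are hypothetical, invented for illustration.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def parse_segment(segment: str) -> Tuple[float, float]:
    """Split a 'start_time-end_time' string into (start, end) floats.

    Assumption: both values are non-negative numbers in seconds.
    """
    start_text, end_text = segment.split("-", 1)
    start, end = float(start_text), float(end_text)
    if end < start:
        raise ValueError(f"segment ends before it starts: {segment!r}")
    return start, end


def category_shares(rows: Iterable[dict]) -> Dict[str, Tuple[int, float]]:
    """Map each category to (count, percentage of all rows)."""
    counts = Counter(row["category"] for row in rows)
    total = sum(counts.values())
    return {name: (n, round(100.0 * n / total, 1)) for name, n in counts.items()}


# Hypothetical rows for illustration only; not taken from the dataset.
sample_rows = [
    {"video_id": "v1", "segment": "0.0-10.5", "category": "temporal"},
    {"video_id": "v1", "segment": "10.5-30.0", "category": "causal"},
    {"video_id": "v2", "segment": "5-20", "category": "temporal"},
    {"video_id": "v2", "segment": "20-45.25", "category": "binary"},
]

for row in sample_rows:
    start, end = parse_segment(row["segment"])
    print(f"{row['video_id']}: {end - start:.2f}s ({row['category']})")

print(category_shares(sample_rows))
```

Both helpers take ordinary dictionaries, so they can be tried on a handful of rows before running over the full 1,444,526 questions. If the real `segment` values use a different notation (for example clock-style timestamps), `parse_segment` would need to be adapted.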

Provider: maas
Created: 2025-08-04