Voxel51/qualcomm-interactive-video-dataset
Source: Hugging Face | Updated: 2025-12-12 | Indexed: 2025-12-20
Download link:
https://hf-mirror.com/datasets/Voxel51/qualcomm-interactive-video-dataset
Description:
QIVD (Qualcomm Interactive Video Dataset) is a video question-answering dataset for evaluating how well multimodal AI models understand and reason about video content. It contains 2,900 video samples. Each sample pairs a video with a question about its content, a detailed answer, a short answer, and a timestamp indicating when the answer can be found in the video. The samples span 13 categories of video understanding tasks, including object referencing, action detection, object attributes, action counting, and object counting, as well as more specialized tasks such as audio-visual reasoning and OCR in videos. The dataset is curated by Qualcomm AI Research for training and evaluating VideoQA models, and supports multi-category understanding and temporal reasoning tasks.
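To make the per-sample structure described above concrete, here is a minimal Python sketch of one record and two simple queries over a list of records. It is not the dataset's actual schema: the class name `QIVDSample`, every field name, the use of seconds for the timestamp, and the example values are assumptions made for illustration only.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List


@dataclass
class QIVDSample:
    """One video QA record. All field names are hypothetical."""
    video_path: str
    question: str
    answer: str               # detailed answer
    short_answer: str         # short answer
    answer_timestamp_s: float  # assumed unit: seconds into the video
    category: str             # one of the 13 task categories


def count_by_category(samples: List[QIVDSample]) -> Counter:
    """Count how many samples fall into each task category."""
    return Counter(s.category for s in samples)


def answerable_by(samples: List[QIVDSample], t: float) -> List[QIVDSample]:
    """Keep samples whose answer timestamp is at or before time t (seconds)."""
    return [s for s in samples if s.answer_timestamp_s <= t]


# Made-up records, not taken from the dataset.
samples = [
    QIVDSample("a.mp4", "How many cups are on the table?",
               "There are two cups on the table.", "2", 3.5, "object counting"),
    QIVDSample("b.mp4", "How many times did I clap?",
               "You clapped three times.", "3", 7.0, "action counting"),
    QIVDSample("c.mp4", "How many pens am I holding?",
               "You are holding one pen.", "1", 1.2, "object counting"),
]

print(count_by_category(samples))
print([s.video_path for s in answerable_by(samples, 4.0)])
```

Filtering on the timestamp reflects the temporal side of the task: an answer given before the annotated moment could not yet be grounded in what the video has shown.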
Provider:
Voxel51



