AIGVQA-DB | AI-Generated Video Dataset | Video Quality Assessment Dataset

arXiv | Updated 2024-11-26 | Indexed 2024-11-29

AI-generated video

Video quality assessment

Download link:

https://github.com/wangjiarui153/AIGV-Assessor

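Resource overview:

AIGVQA-DB is a large-scale dataset created by the Institute of Image Communication and Network Engineering at Shanghai Jiao Tong University. It contains 36,576 AI-generated videos produced by 15 advanced text-to-video generation models from 1,048 diverse prompts, and a systematic annotation pipeline collected 370,000 expert ratings. Construction covered video generation, annotation, and scoring, with the goal of addressing perceptual quality assessment for AI-generated videos, particularly distortions unique to them such as unrealistic objects, unnatural motion, and inconsistent visual elements. Application areas include entertainment, art, design, and advertising, and the dataset aims to improve the accuracy and comprehensiveness of video quality assessment.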

Provider:

Institute of Image Communication and Network Engineering, Shanghai Jiao Tong University

Created:

2024-11-26

AI-Compiled Summary

Dataset Introduction

Construction

AIGVQA-DB is constructed to address the challenges specific to AI-generated videos (AIGVs). The dataset comprises 36,576 AIGVs generated by 15 advanced text-to-video models from 1,048 diverse prompts. A systematic annotation pipeline, incorporating both scoring and ranking, collected 370k expert ratings. These ratings cover perceptual quality, including distortions such as unrealistic objects, unnatural movements, and inconsistent visual elements.

Usage

Researchers and practitioners can use AIGVQA-DB to benchmark and evaluate text-to-video generation models. The dataset supports quality regression tasks and pairwise preference comparisons. Using the provided expert ratings and model outputs, users can develop and fine-tune VQA models to predict video quality scores and to judge which video in a pair is preferred. The dataset is publicly available, which supports collaborative research on AI-generated video quality assessment.
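
To make the two evaluation tasks concrete, the following minimal sketch scores a hypothetical VQA model against human ratings. It assumes quality regression is measured with the Spearman rank correlation (SRCC) and Pearson linear correlation (PLCC) that are common in VQA work, and that pairwise preference is measured as plain accuracy. The toy scores, the pair format `(i, j, winner)`, and all function names are illustrative assumptions, not the dataset's actual file layout or the authors' evaluation code.

```python
from math import sqrt

def pearson(x, y):
    """Pearson linear correlation coefficient (PLCC)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman rank correlation (SRCC): Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def pairwise_accuracy(pred, pairs):
    """Fraction of pairs where the higher predicted score matches the human choice.

    pairs: list of (i, j, winner), where winner is the index (i or j)
    that annotators preferred. This tuple format is hypothetical.
    """
    correct = 0
    for i, j, winner in pairs:
        predicted = i if pred[i] > pred[j] else j
        correct += predicted == winner
    return correct / len(pairs)

# Toy data, invented for illustration only.
mos = [72.0, 45.5, 60.1, 30.2, 88.4]    # human mean opinion scores
pred = [70.0, 50.0, 55.0, 35.0, 90.0]   # model-predicted scores
pairs = [(0, 1, 0), (2, 3, 2), (1, 4, 4), (0, 2, 2)]

srcc = spearman(pred, mos)
plcc = pearson(pred, mos)
acc = pairwise_accuracy(pred, pairs)
print(f"SRCC={srcc:.3f} PLCC={plcc:.3f} pair accuracy={acc:.2f}")
```

In this toy case the model ranks all five videos in the same order as the human scores, yet it still gets one of the four pairs wrong because the hypothetical annotators preferred video 2 over video 0 in a direct comparison. This is one reason score regression and pairwise preference are worth reporting separately.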

Background and Challenges

Background Overview

AIGVQA-DB, a large-scale dataset of 36,576 AI-generated videos (AIGVs) annotated with mean opinion scores (MOS) and pairwise comparisons, was introduced by researchers from Shanghai Jiao Tong University. It was created to meet the need for video quality assessment (VQA) models designed specifically for AIGVs. The core research problem is accurately assessing the perceptual quality of AIGVs, which often suffer from distinctive distortions such as unrealistic objects, unnatural movements, or inconsistent visual elements. The videos were generated with 15 advanced text-to-video models from 1,048 diverse prompts, and a systematic annotation pipeline then collected 370k expert ratings. AIGVQA-DB provides a comprehensive benchmark for evaluating text-to-video models from multiple perspectives.
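
As a reminder of what a MOS label represents, the sketch below averages the individual ratings each video receives. It assumes ratings arrive as `(video_id, score)` tuples, which is a hypothetical format chosen for illustration. The dataset's actual annotation processing is not described here and may include steps such as score normalization or outlier rejection that this sketch omits.

```python
from collections import defaultdict

def mean_opinion_scores(ratings):
    """Return {video_id: MOS}, where MOS is the mean of that video's ratings.

    ratings: iterable of (video_id, score) tuples (hypothetical format).
    """
    by_video = defaultdict(list)
    for video_id, score in ratings:
        by_video[video_id].append(score)
    return {vid: sum(scores) / len(scores) for vid, scores in by_video.items()}

# Toy ratings, invented for illustration only.
ratings = [("v1", 4), ("v1", 5), ("v1", 3), ("v2", 2), ("v2", 1)]
mos_by_video = mean_opinion_scores(ratings)
print(mos_by_video)
```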

Current Challenges

The primary challenge addressed by AIGVQA-DB is the accurate assessment of perceptual quality in AI-generated videos, which often exhibit unique distortions not seen in natural videos. Traditional VQA methods struggle with these specific distortions, such as spatial artifacts, temporal inconsistencies, and misalignment between generated content and text prompts. Additionally, the dataset's construction faced challenges in generating a diverse set of high-quality videos and ensuring the reliability of expert ratings through a systematic annotation pipeline. The dataset also highlights the need for more comprehensive metrics that can reflect human preferences for individual videos, beyond traditional fidelity-based evaluations. This necessitates the development of novel VQA models that can capture intricate quality attributes and provide accurate and robust quality assessments.

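Common Scenarios

Typical Use Cases

The typical use of AIGVQA-DB is evaluating and improving the perceptual quality of text-to-video generation models. With a large-scale set of AI-generated videos, researchers can develop and validate new video quality assessment models, especially for distortions common in AI-generated videos such as unrealistic objects, unnatural motion, or inconsistent visual elements. The dataset can also be used to train and test assessment methods based on spatiotemporal features and large multimodal models (LMMs), in order to capture the complex quality attributes of AI-generated videos.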

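Academic Problems Addressed

AIGVQA-DB addresses a common academic problem faced by current video quality assessment models when evaluating AI-generated videos. Traditional video quality assessment methods mainly target professionally generated content (PGC) and user-generated content (UGC) and struggle with the distortions unique to AI-generated videos. By providing large-scale AI-generated videos with expert ratings, the dataset supports the development of more comprehensive and precise assessment models and thereby advances research in video quality assessment.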

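Practical Applications

AIGVQA-DB has a broad range of practical applications. For example, in entertainment, art, design, and advertising, content creators can use the dataset to evaluate and improve the quality of their AI-generated videos. Video platforms and social media services can use it to filter and recommend high-quality AI-generated videos, improving the user experience. The dataset can also be used to develop and optimize video editing tools that help users produce higher-quality video content.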

Recent Research on the Dataset