FETV
Source: arXiv | Updated: 2023-12-26 | Indexed: 2024-06-21
Download link:
https://github.com/llyx97/FETV
Resource overview:
FETV is a benchmark for fine-grained evaluation of open-domain text-to-video generation models, developed jointly by the National Key Laboratory of Multimedia Information Processing at Peking University and Huawei Noah's Ark Lab. It contains 619 text prompts, each categorized along three orthogonal aspects: content, control attributes, and prompt complexity. In addition to the spatial content of generated videos, FETV introduces temporal categories to cover the temporal information specific to video generation. The dataset was built with a two-step annotation process: category labels are first assigned automatically and then reviewed manually. FETV is mainly used to evaluate and improve text-to-video generation models, in particular to address the mismatch between existing automatic evaluation metrics and human judgment.
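As a rough illustration of how prompts labeled along three orthogonal aspects can be tallied for fine-grained analysis, the sketch below counts labels per aspect. The record layout, field names (`content`, `attribute`, `complexity`), and label values are hypothetical placeholders, not the actual FETV file schema; consult the repository for the real format.

```python
from collections import Counter

# Hypothetical records: field names and label values are illustrative
# assumptions only, NOT the real FETV schema.
prompts = [
    {"text": "a dog runs fast on the beach",
     "content": ["animals"], "attribute": ["speed"], "complexity": "simple"},
    {"text": "a red car turns left at a crossing",
     "content": ["vehicles"], "attribute": ["color", "direction"], "complexity": "medium"},
    {"text": "a cat sleeps",
     "content": ["animals"], "attribute": [], "complexity": "simple"},
]

ASPECTS = ("content", "attribute", "complexity")


def count_by_aspect(records):
    """Return one Counter of label frequencies per aspect."""
    counts = {aspect: Counter() for aspect in ASPECTS}
    for record in records:
        for aspect in ASPECTS:
            labels = record[aspect]
            # Allow a single string label as well as a list of labels.
            if isinstance(labels, str):
                labels = [labels]
            counts[aspect].update(labels)
    return counts


counts = count_by_aspect(prompts)
for aspect in ASPECTS:
    print(aspect, dict(counts[aspect]))
```

Because the aspects are treated independently, a prompt contributes to each aspect separately, so per-category scores of a model could be averaged within any one aspect without touching the others.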
Provided by:
National Key Laboratory of Multimedia Information Processing, Peking University
Created:
2023-11-03



