RAD (RELEVANCE AND DIVERSITY DATASET)
OpenDataLab | Updated: 2026-05-24 | Indexed: 2024-05-09
Download link:
https://opendatalab.org.cn/OpenDataLab/RAD
Resource description:
This dataset is useful for query-adaptive video summarization and is annotated with diversity and query-specific relevance labels. While the problem of automatic video summarization has received considerable attention in recent years, the task of generating video summaries tailored to search queries has been relatively understudied. We address this gap by framing query-relevant summarization as a video frame subset selection problem, which enables us to optimize summaries that are simultaneously diverse, representative of the entire video, and relevant to the textual query. We quantify relevance by measuring the distance between frames and the query within the shared text-visual semantic embedding space induced by a neural network. Furthermore, we extend our model to capture query-agnostic attributes such as frame quality. We compare our method against prior text-visual embedding techniques developed for thumbnail selection, and demonstrate that our model outperforms these baselines in relevance prediction. Additionally, we introduce a novel dataset annotated with diversity and query-specific relevance labels. Using this dataset, we train and evaluate our complete video summarization model, and show that it outperforms standard baselines such as Maximum Marginal Relevance (MMR).
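The Maximum Marginal Relevance (MMR) baseline named above can be illustrated with a minimal sketch. This is the generic greedy MMR procedure, not the authors' subset selection model. The function names (`cosine`, `mmr_select`), the trade-off weight `lam`, and the 2-D toy embeddings are all assumptions made for illustration; in practice the frame and query vectors would come from a learned text-visual embedding network.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def mmr_select(frame_embs, query_emb, k, lam=0.5):
    """Greedily pick k frame indices, trading query relevance against
    redundancy with frames already chosen (hypothetical helper)."""
    relevance = [cosine(f, query_emb) for f in frame_embs]
    selected = []
    candidates = list(range(len(frame_embs)))
    while candidates and len(selected) < k:
        def score(i):
            # Redundancy: highest similarity to any already-selected frame.
            redundancy = max(
                (cosine(frame_embs[i], frame_embs[j]) for j in selected),
                default=0.0,
            )
            return lam * relevance[i] - (1 - lam) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected

# Toy embeddings (assumed values, not taken from the dataset).
frames = [
    [0.8, 0.6],    # 0: most relevant to the query
    [0.75, 0.6],   # 1: near-duplicate of frame 0
    [0.6, -0.8],   # 2: less relevant, but unlike frame 0
    [0.0, 1.0],    # 3: irrelevant to the query
]
query = [1.0, 0.0]

print(mmr_select(frames, query, k=2, lam=0.5))  # -> [0, 2]
print(mmr_select(frames, query, k=2, lam=1.0))  # -> [0, 1]
```

With `lam=1.0` the selection reduces to pure relevance ranking and picks the near-duplicate frame; lowering `lam` penalizes redundancy, so the second pick shifts to a visually different frame.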
Provider:
OpenDataLab
Created:
2022-06-28
Collected summary
Dataset introduction

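Background and challenges
Background overview
The RAD dataset is designed for query-adaptive video summarization and provides video data annotated with diversity and query-relevance labels. It supports a video frame subset selection approach that optimizes summaries for diversity, representativeness, and relevance to a text query, and it is used to train and test such models with the aim of improving summarization performance. The dataset was released in 2017 by ETH Zurich and Katholieke Universiteit Leuven.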
The summary above was collected and generated by 遇见数据集.



