YouTube8M-MusicTextClips
Mendeley Data | Updated 2024-05-10 | Indexed 2024-06-29
Download link:
https://zenodo.org/records/8040754
Resource description:
YouTube8M-MusicTextClips Dataset

This page includes the YouTube8M-MusicTextClips dataset from our CVPR 2023 paper:

Language-Guided Music Recommendation for Video via Prompt Analogies
Daniel McKee (1), Justin Salamon (2), Josef Sivic (2, 3), Bryan Russell (2)
(1) University of Illinois at Urbana-Champaign, (2) Adobe Research, (3) Czech Institute of Informatics, Robotics and Cybernetics, Czech Technical University

The dataset is licensed under a Research-only, non-commercial Adobe Research License. Please see our attached LICENSE file for more information.

Dataset Description

The YouTube8M-MusicTextClips dataset consists of over 4k high-quality human text descriptions of music found in video clips from the YouTube8M dataset. For each selected YouTube music video, we extracted a 10-second clip from the middle of the video for annotation. We provided annotators with only the audio corresponding to this clip. Thus, the text annotations describe the audio alone, not the visual content of the clip.

The dataset annotations are divided into train and test split files. As the dataset is meant mainly for evaluation, there are 3169 annotated clips from the test set and only 1000 annotated clips from the train set. Each file contains the following information for each sample:
- video_id: The YouTube ID corresponding to the video containing an annotated clip
- start: Start time (in seconds) of the annotated clip in the video
- end: End time (in seconds) of the annotated clip in the video
- text: The text annotation describing the music from the annotated clip

For more information, please check our project page and paper: https://www.danielbmckee.com/language-guided-music-for-video/

Citation

If you use this dataset, please cite our paper:
McKee, D., Salamon, J., Sivic, J., & Russell, B. (2023). Language-Guided Music Recommendation for Video via Prompt Analogies. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2023).
Bibtex:
@InProceedings{mckee2023language,
  author = {McKee, Daniel and Salamon, Justin and Sivic, Josef and Russell, Bryan},
  title = {Language-Guided Music Recommendation for Video via Prompt Analogies},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year = {2023},
}
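
The per-sample fields named in the dataset description (video_id, start, end, text) map naturally onto a small record type. The sketch below is one possible way to read such records in Python. It is not the authors' loader. This page does not state the on-disk format of the split files, so CSV with a header row is an assumption here, and the names ClipAnnotation, load_annotations, and SAMPLE are hypothetical. The two sample rows are invented for illustration and are not taken from the dataset.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class ClipAnnotation:
    """One annotated clip, using the four fields listed in the description."""
    video_id: str  # YouTube ID of the video containing the clip
    start: float   # clip start time in seconds
    end: float     # clip end time in seconds
    text: str      # human-written description of the music

    @property
    def duration(self) -> float:
        # The description says clips are 10 seconds long.
        return self.end - self.start

    @property
    def url(self) -> str:
        # Standard YouTube watch URL with a start offset in whole seconds.
        return f"https://www.youtube.com/watch?v={self.video_id}&t={int(self.start)}s"


def load_annotations(handle):
    """Parse annotation rows from a file object.

    ASSUMPTION: the split file is CSV with a header row naming the fields.
    Adapt the parsing if the real files use another format.
    """
    reader = csv.DictReader(handle)
    return [
        ClipAnnotation(
            video_id=row["video_id"],
            start=float(row["start"]),
            end=float(row["end"]),
            text=row["text"],
        )
        for row in reader
    ]


# Hypothetical rows, invented for illustration only.
SAMPLE = """video_id,start,end,text
abc123XYZ_0,95,105,"A slow acoustic guitar piece with soft, airy vocals."
def456UVW_1,42.5,52.5,"Fast electronic track with a driving bass line."
"""

clips = load_annotations(io.StringIO(SAMPLE))
for clip in clips:
    print(clip.video_id, clip.duration, clip.url)
```

To read a real split file, pass an open file object instead of the in-memory string, for example open(path, newline="", encoding="utf-8"). Checking that every clip's duration is close to 10 seconds is a cheap sanity test after loading.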
Created:
2023-06-28
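Collected summary
Dataset introduction

Background and challenges
Background overview
The YouTube8M-MusicTextClips dataset contains over 4k high-quality human text descriptions that annotate the audio of 10-second music clips from the YouTube8M dataset. It is meant mainly for evaluation and contains 3169 annotated test samples and 1000 annotated training samples. Each sample provides the video ID, the clip start and end times, and the text description.
The summary above was collected and generated by 遇见数据集.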



