Unnamed Audio Captioning Dataset
Source: arXiv | Updated: 2019-07-22 | Indexed: 2024-08-06
Download link:
http://arxiv.org/abs/1907.09238v1
Description:
This dataset was created by the Audio Research Group at Tampere University to collect audio captions through crowdsourcing, in support of tasks that describe audio content in text. It contains 5000 audio files, each 15 to 30 seconds long and each paired with 5 different captions. The captions were produced in three stages: collecting initial captions, editing them to correct errors and improve wording, and rating the quality of the resulting captions. The dataset is intended mainly for developing and evaluating automatic audio captioning methods, which treat audio description as a multimodal translation problem.
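The composition described above (5000 clips, 15 to 30 seconds each, 5 different captions per clip) can be expressed as a simple consistency check. The following is a minimal sketch only: the record layout, the field names (file_name, duration_s, captions), and the validate_clip helper are hypothetical and are not taken from the dataset's actual distribution format.

```python
from dataclasses import dataclass
from typing import List

# Figures taken from the description above.
EXPECTED_CLIPS = 5000
CAPTIONS_PER_CLIP = 5
MIN_DURATION_S = 15.0
MAX_DURATION_S = 30.0


@dataclass
class CaptionedClip:
    """Hypothetical record layout; the real file format is not specified here."""
    file_name: str
    duration_s: float
    captions: List[str]


def validate_clip(clip: CaptionedClip) -> List[str]:
    """Return a list of problems found; an empty list means the clip is consistent."""
    problems = []
    if not (MIN_DURATION_S <= clip.duration_s <= MAX_DURATION_S):
        problems.append(f"{clip.file_name}: duration {clip.duration_s} s is outside 15-30 s")
    if len(clip.captions) != CAPTIONS_PER_CLIP:
        problems.append(f"{clip.file_name}: expected {CAPTIONS_PER_CLIP} captions, got {len(clip.captions)}")
    if len(set(clip.captions)) != len(clip.captions):
        problems.append(f"{clip.file_name}: captions are not all different")
    return problems


def validate_dataset(clips: List[CaptionedClip]) -> List[str]:
    """Check the clip count, then every clip, and collect all problems."""
    problems = []
    if len(clips) != EXPECTED_CLIPS:
        problems.append(f"expected {EXPECTED_CLIPS} clips, got {len(clips)}")
    for clip in clips:
        problems.extend(validate_clip(clip))
    return problems
```

A loader for the real files would populate CaptionedClip from whatever metadata format the dataset ships with, then call validate_dataset once before training or evaluation.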
Provided by:
Audio Research Group, Tampere University
Created:
2019-07-22



