Auto-ACD
Source: arXiv. Updated: 2023-10-03. Indexed: 2024-06-21.
Download link:
https://auto-acd.github.io/
Overview:
Auto-ACD is a large-scale audio-language dataset built by Shanghai Jiao Tong University, containing over 1.9 million audio-text pairs. It was created with an automatic audio captioning pipeline that uses public tools and APIs to generate the audio descriptions. Beyond the type and source of each sound, the captions also describe the specific environment in which the sound occurs. The dataset is suited to tasks such as audio-language retrieval, audio captioning, and environment classification, and it aims to address shortcomings of existing audio-language datasets: limited data volume, simplistic content, and complex collection processes.
Provided by:
Shanghai Jiao Tong University
Created:
2023-09-21



