SOUNDDESCS

Name: SOUNDDESCS
Creator: Interpretable Machine Learning Group, University of Tübingen
Published: 2022-01-28 05:00:58
License: No description available

Source: arXiv; updated 2022-01-28; indexed 2024-06-21

Download link:

https://github.com/akoepke/audio-retrieval-benchmark

Overview:

The SOUNDDESCS dataset was created by the Interpretable Machine Learning Group at the University of Tübingen. It contains 32,979 audio files paired with natural language descriptions and is designed to support text-based audio retrieval research. The data is sourced from the BBC Sound Effects web pages and covers a variety of sound categories, such as nature, clocks, and fire. Both audio duration and description length vary widely, which makes the dataset a rich benchmark for cross-modal audio-text retrieval. It is particularly suited to studying how unconstrained text queries match audio content, which supports the development of more natural and flexible audio retrieval systems, with potential applications in low-power IoT devices, historical sound archives, and creative work.
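
As a rough illustration of how a text-to-audio retrieval benchmark of this kind is commonly scored, the sketch below computes recall@k: the fraction of text queries whose paired audio clip appears among the top k results. This is a minimal sketch under stated assumptions, not code from the SOUNDDESCS repository. The function names and the toy 2-D embeddings are hypothetical; in practice the embeddings would come from trained text and audio encoders, and it is assumed here that text query i describes audio clip i.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


def recall_at_k(text_embs, audio_embs, k):
    """Fraction of text queries whose paired clip ranks in the top k.

    Assumption (hypothetical setup): text_embs[i] describes audio_embs[i].
    """
    hits = 0
    for i, query in enumerate(text_embs):
        scores = [cosine(query, clip) for clip in audio_embs]
        # Rank of the paired clip = number of clips scoring strictly higher.
        rank = sum(1 for s in scores if s > scores[i])
        if rank < k:
            hits += 1
    return hits / len(text_embs)


# Hand-made toy embeddings, for illustration only.
audio_embs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
text_embs = [[0.9, 0.1], [0.2, 0.8], [1.0, 0.1]]

print(recall_at_k(text_embs, audio_embs, 1))
print(recall_at_k(text_embs, audio_embs, 2))
```

In this toy setup the third query sits closer to the first clip than to its own, so it only counts as a hit once k reaches 2.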

Provider:

Interpretable Machine Learning Group, University of Tübingen

Created:

2021-12-17
