SiVi-CAFE dataset - Sighted and Visually-impaired Captions for Audio in Finnish and English

NIAID Data Ecosystem, indexed 2026-05-02

Download link:

https://zenodo.org/record/11505822

Resource description:

This dataset contains audio captions for audio files from the TAU Urban Acoustic Scenes 2019 development dataset (airport, public square, and park) for 10 cities. The files were annotated using a web-based tool as presented in:

Martin Morato, I., & Mesaros, A. (2021). Diversity and bias in audio captioning datasets. In F. Font, A. Mesaros, D. P. W. Ellis, E. Fonseca, M. Fuentes, & B. Elizalde (Eds.), Proceedings of the 6th Workshop on Detection and Classification of Acoustic Scenes and Events (DCASE 2021) (pp. 90-94).

Each file is annotated by multiple annotators, each of whom provided a one-sentence description of the audio content.

Data is provided in csv files:

sighted-EN-bias-original
sighted-FI-bias-translated
sighted-EN-no_bias-original
sighted-FI-no_bias-translated
visually_impaired-FI-original
visually_impaired-EN-translated
sighted-FI-original
sighted-EN-translated

original = original descriptions, not translated
translated = descriptions translated using an automatic deep learning tool

900 annotated audio files: Finnish audio descriptions provided by visually impaired and sighted people.
2050 annotated audio files: English audio descriptions provided by international students (not necessarily native English speakers).
3930 annotated audio files: English audio descriptions provided by international students (not necessarily native English speakers), biased by the provided audio tags.

The audio files can be downloaded from https://zenodo.org/record/2589280 and are covered by their own license.
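The csv file names listed above appear to follow a hyphen-separated pattern: annotator group, language, an optional bias field, and original/translated. As a minimal sketch, assuming that reading of the names is right, the following Python splits a file name stem into those fields. The `CaptionFile` type and `parse_name` helper are hypothetical names introduced here for illustration; the column layout inside the csv files is not stated in the description, so the sketch does not read the files themselves.

```python
from typing import NamedTuple, Optional


class CaptionFile(NamedTuple):
    """Fields inferred from a file name stem (an assumption, not a documented schema)."""
    annotators: str        # "sighted" or "visually_impaired"
    language: str          # "EN" or "FI"
    bias: Optional[str]    # "bias", "no_bias", or None when the name has no bias field
    origin: str            # "original" or "translated"


def parse_name(stem: str) -> CaptionFile:
    """Split a stem such as 'sighted-EN-bias-original' on hyphens."""
    parts = stem.split("-")
    if len(parts) == 4:
        annotators, language, bias, origin = parts
    elif len(parts) == 3:
        annotators, language, origin = parts
        bias = None
    else:
        raise ValueError(f"unexpected file name pattern: {stem!r}")
    return CaptionFile(annotators, language, bias, origin)


# File name stems exactly as given in the dataset description.
STEMS = [
    "sighted-EN-bias-original",
    "sighted-FI-bias-translated",
    "sighted-EN-no_bias-original",
    "sighted-FI-no_bias-translated",
    "visually_impaired-FI-original",
    "visually_impaired-EN-translated",
    "sighted-FI-original",
    "sighted-EN-translated",
]

if __name__ == "__main__":
    for stem in STEMS:
        print(parse_name(stem))
```

Parsing the names this way makes it straightforward to select, for example, only the untranslated files by filtering on `origin == "original"`.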

Created:

2024-06-12