SiVi-CAFE dataset - Sighted and Visually-impaired Captions for Audio in Finnish and English
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/record/11505822
Description:
This is a dataset containing audio captions for audio files of the TAU Urban Acoustic Scenes 2019 development dataset (airport, public square, and park) for 10 cities.
The files were annotated using a web-based tool as presented in:
Martin Morato, I., & Mesaros, A. (2021). Diversity and bias in audio captioning datasets. In F. Font, A. Mesaros, D. P.W. Ellis, E. Fonseca, M. Fuentes, & B. Elizalde (Eds.), Proceedings of the 6th Workshop on Detection and Classification of Acoustic Scenes and Events (DCASE 2021) (pp. 90-94).
Each file is annotated by multiple annotators, each of whom provided a one-sentence description of the audio content.
Data is provided in CSV files:
sighted-EN-bias-original
sighted-FI-bias-translated
sighted-EN-no_bias-original
sighted-FI-no_bias-translated
visually_impaired-FI-original
visually_impaired-EN-translated
sighted-FI-original
sighted-EN-translated
original = original descriptions, non-translated
translated = descriptions translated using an automatic deep learning tool
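The file names above follow a regular pattern (annotator group, language, an optional bias condition, and original/translated), so they can be split into metadata before loading. The following is a minimal sketch under that assumption; the `CaptionFile` type and `parse_name` helper are hypothetical names introduced here for illustration, and the column layout inside the CSV files is not described in this record, so it is not touched.

```python
from typing import NamedTuple, Optional

# File name stems exactly as listed in the dataset description.
FILE_STEMS = [
    "sighted-EN-bias-original",
    "sighted-FI-bias-translated",
    "sighted-EN-no_bias-original",
    "sighted-FI-no_bias-translated",
    "visually_impaired-FI-original",
    "visually_impaired-EN-translated",
    "sighted-FI-original",
    "sighted-EN-translated",
]


class CaptionFile(NamedTuple):
    """Hypothetical container for the fields encoded in a file name."""
    group: str                # "sighted" or "visually_impaired"
    language: str             # "EN" or "FI"
    condition: Optional[str]  # "bias", "no_bias", or None when absent
    translated: bool          # True for machine-translated captions


def parse_name(stem: str) -> CaptionFile:
    """Split a stem such as 'sighted-EN-bias-original' into its fields."""
    parts = stem.split("-")
    if len(parts) == 4:
        group, language, condition, kind = parts
    elif len(parts) == 3:
        group, language, kind = parts
        condition = None
    else:
        raise ValueError(f"unexpected file name: {stem!r}")
    if kind not in ("original", "translated"):
        raise ValueError(f"unexpected suffix in file name: {stem!r}")
    return CaptionFile(group, language, condition, kind == "translated")


# Keep only the files that contain human-written (non-translated) captions.
originals = [s for s in FILE_STEMS if not parse_name(s).translated]
```

Filtering on `translated` like this is one way to keep machine-translated captions out of an analysis of the original annotations.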
900 annotated audio files: Finnish audio descriptions provided by visually impaired and sighted people.
2050 annotated audio files: English audio descriptions provided by international students (not necessarily native English speakers).
3930 annotated audio files: English audio descriptions provided by international students (not necessarily native English speakers), biased by the provided audio tags.
The audio files can be downloaded from https://zenodo.org/record/2589280 and are covered by their own license.
Created:
2024-06-12



