
SiVi-CAFE dataset - Sighted and Visually-impaired Captions for Audio in Finnish and English

NIAID Data Ecosystem, indexed 2026-05-02
Download link:
https://zenodo.org/record/11505822
Description:
This dataset contains audio captions for audio files from the TAU Urban Acoustic Scenes 2019 development dataset (airport, public square, and park) for 10 cities.

The files were annotated using a web-based tool as presented in: Martin Morato, I., & Mesaros, A. (2021). Diversity and bias in audio captioning datasets. In F. Font, A. Mesaros, D. P.W. Ellis, E. Fonseca, M. Fuentes, & B. Elizalde (Eds.), Proceedings of the 6th Workshop on Detection and Classification of Acoustic Scenes and Events (DCASE 2021) (pp. 90-94).

Each file is annotated by multiple annotators, each of whom provided a one-sentence description of the audio content.

Data is provided in csv files:
sighted-EN-bias-original
sighted-FI-bias-translated
sighted-EN-no_bias-original
sighted-FI-no_bias-translated
visually_impaired-FI-original
visually_impaired-EN-translated
sighted-FI-original
sighted-EN-translated

original = original descriptions, not translated
translated = descriptions translated using an automatic deep learning tool

Contents:
900 annotated audio files, Finnish audio descriptions provided by visually impaired and sighted people.
2050 annotated audio files, English audio descriptions provided by international students (not necessarily native English speakers).
3930 annotated audio files, English audio descriptions provided by international students (not necessarily native English speakers), biased by the provided audio tags.

The audio files can be downloaded from https://zenodo.org/record/2589280 and are covered by their own license.
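The csv file names listed in the description encode the annotator group, the caption language, an optional bias condition, and whether the captions are original or machine-translated. As a minimal sketch, the names could be split into those parts as shown here. The `CaptionFile` type and `parse_caption_name` helper are hypothetical names introduced for illustration, and the `.csv` extension on the files is an assumption; the column layout inside the files is not described in this listing, so the sketch does not read them.

```python
from typing import NamedTuple, Optional


class CaptionFile(NamedTuple):
    """Hypothetical record for the parts encoded in a caption file name."""
    group: str                 # "sighted" or "visually_impaired"
    language: str              # "EN" or "FI"
    condition: Optional[str]   # "bias", "no_bias", or None when absent
    version: str               # "original" or "translated"


def parse_caption_name(name: str) -> CaptionFile:
    """Split a name such as 'sighted-EN-no_bias-original' into its parts."""
    # Assumption: files may carry a '.csv' suffix; strip it if present.
    stem = name[:-4] if name.lower().endswith(".csv") else name
    parts = stem.split("-")
    if len(parts) == 3:
        group, language, version = parts
        condition = None
    elif len(parts) == 4:
        group, language, condition, version = parts
    else:
        raise ValueError(f"unexpected file name: {name!r}")
    if version not in ("original", "translated"):
        raise ValueError(f"unexpected version in file name: {name!r}")
    return CaptionFile(group, language, condition, version)


# The eight names given in the dataset description.
NAMES = [
    "sighted-EN-bias-original",
    "sighted-FI-bias-translated",
    "sighted-EN-no_bias-original",
    "sighted-FI-no_bias-translated",
    "visually_impaired-FI-original",
    "visually_impaired-EN-translated",
    "sighted-FI-original",
    "sighted-EN-translated",
]

if __name__ == "__main__":
    for n in NAMES:
        print(parse_caption_name(n))
```

Parsing the names this way makes it straightforward to pair each original file with its translated counterpart, since the two share group and condition and differ only in language and version.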
Created:
2024-06-12