RSL2019: A Realistic Speech Localization Corpus

Source: Mendeley Data. Updated: 2024-03-27. Indexed: 2024-06-28.
Download link:
https://zenodo.org/record/4925958
Description:
We present a new database for speech localization that we refer to as the Realistic Speech Localization 2019 (RSL2019) corpus. The corpus is designed for the study of sound source localization in real-world applications. The RSL2019 corpus is a continuing effort, which presently contains 22.60 hours of speech data, played over a loudspeaker from different directions of arrival (DOA) and recorded using a four-channel microphone array. We consider 180 speech utterances spoken by 6 speakers, selected from the RSR2015 database, which are played over the loudspeaker positioned at different angles and distances from the microphone array. We vary the DOA from 0 to 360 degrees at intervals of 5 degrees, at distances of 1 metre and 1.5 metres. From each position and DOA, we also record white noise to study robustness, and a time-stretched pulse to generate the transfer function for the speech localization algorithm. Furthermore, we present experimental results and analysis of a state-of-the-art sound source localization algorithm, using the open-source HARK toolkit, on the created RSL2019 database.

This database is provided for research purposes only. If you use this database, please cite the following paper:

Rohan Sheelvant, Bidisha Sharma, Maulik Madhavi, Rohan Kumar Das, S.R.M. Prasanna and Haizhou Li, "RSL2019: A Realistic Speech Localization Corpus," 2019 22nd Conference of the Oriental COCOSDA International Committee for the Co-ordination and Standardisation of Speech Databases and Assessment Techniques (O-COCOSDA), 2019, pp. 1-6, doi: 10.1109/O-COCOSDA46868.2019.9060842.
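The recording grid described above can be enumerated with a short sketch. This is only an illustration, not the authors' tooling: the variable names are hypothetical, and it assumes that 0 and 360 degrees denote the same direction, so 360 is left out. If the corpus stores them as separate recordings, there would be 73 angles rather than 72.

```python
from itertools import product

# Values stated in the corpus description.
DOA_STEP_DEG = 5
DISTANCES_M = (1.0, 1.5)

# Assumption: 0 and 360 degrees are the same direction, so 360 is excluded.
doa_angles_deg = list(range(0, 360, DOA_STEP_DEG))

# Hypothetical (angle, distance) grid of loudspeaker positions.
positions = list(product(doa_angles_deg, DISTANCES_M))

print(len(doa_angles_deg), "angles,", len(positions), "positions")
```

Under that assumption the grid has 72 angles and 144 loudspeaker positions, each of which would carry the speech utterances, white noise, and time-stretched pulse recordings mentioned in the description.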
Created:
2023-06-28