
Multi-Sensor Voice Command Dataset

Source: DataCite Commons. Updated: 2025-03-24. Indexed: 2025-04-16.
Download link:
https://rdr.kuleuven.be/citation?persistentId=doi:10.48804/IEKKVZ
Description:
The repository contains audio files recorded by a wireless audio sensor network of 4 sensors. A total of 20 volunteers were instructed to repeat five English voice commands ("Lights On", "Lights Off", "Music On", "Music Stop", "Next Song") from three different positions, for a total of 15 recordings per keyword per volunteer. In a few cases, manual data cleaning reduced the number of samples to 14. In addition, we recorded 15 generic spoken utterances per speaker, e.g., "Set an alarm to 7am", which are used as negative examples. With a duration of 3 seconds per utterance, this amounts to 1.5 hours of audio per sensor.

The negative data were further augmented with 4.4 hours of recordings obtained by replaying the test-clean and dev-clean sets of LibriSpeech through a set of loudspeakers. The source audio files are available at https://www.openslr.org/12 (dev-clean and test-clean) and are freely distributed under the CC-BY-4.0 license. Every recording was limited to 3 seconds, giving a total of 5.9 hours of audio per sensor in the multi-sensor dataset.

The dataset supports research on new speech recognition algorithms for audio recorded with a network of ultra-low-power smart audio sensors. The target application is voice command recognition, also known as keyword spotting, where the algorithm must recognize a voice command (e.g. "Lights On") and distinguish it from other voice commands and from other audio tracks, i.e. the negative data. In a wireless audio sensor network scenario, the speech recognition algorithm is fed with audio recorded by multiple sensors placed around the environment.
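The per-sensor totals quoted in the description can be checked with a few lines of arithmetic. The sketch below is only a sanity check under stated assumptions: it takes "15 recordings per keyword" to mean per volunteer, ignores the few keywords reduced to 14 samples by cleaning, and uses constant and function names that are illustrative rather than part of the dataset.

```python
# Sanity check of the audio durations quoted in the dataset description.
# All names here are illustrative; they are not part of the dataset itself.

SPEAKERS = 20
KEYWORDS = ["Lights On", "Lights Off", "Music On", "Music Stop", "Next Song"]
RECORDINGS_PER_KEYWORD = 15   # assumed per volunteer, before manual cleaning
NEGATIVES_PER_SPEAKER = 15    # generic utterances used as negative examples
CLIP_SECONDS = 3              # every recording is limited to 3 seconds
LIBRISPEECH_HOURS = 4.4       # replayed LibriSpeech audio, as stated


def volunteer_clip_count():
    """Number of 3-second clips recorded from volunteers, per sensor."""
    positives = SPEAKERS * len(KEYWORDS) * RECORDINGS_PER_KEYWORD
    negatives = SPEAKERS * NEGATIVES_PER_SPEAKER
    return positives + negatives


def volunteer_hours():
    """Hours of volunteer audio per sensor."""
    return volunteer_clip_count() * CLIP_SECONDS / 3600


def total_hours():
    """Hours of audio per sensor, including the replayed negative data."""
    return volunteer_hours() + LIBRISPEECH_HOURS


print(volunteer_clip_count())      # 1800
print(volunteer_hours())           # 1.5
print(round(total_hours(), 1))     # 5.9
```

Under these assumptions the numbers agree with the stated 1.5 hours of volunteer audio and 5.9 hours in total per sensor; the actual clip count is slightly lower because of the cleaned samples.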
Provider:
KU Leuven RDR
Created:
2024-10-21