
LibriWASN

Source: Mendeley Data. Updated: 2024-05-10. Indexed: 2024-06-28.
Download link:
https://zenodo.org/records/10952434
Description:
LibriWASN is a data set whose design is based on the LibriCSS data set. The main difference is that the data was recorded by the distributed devices of an acoustic sensor network, randomly positioned on a meeting table. As a result, the microphone channels of different devices exhibit a sampling rate offset relative to one another.

The data set has a total length of 20 hours and was recorded in two acoustically different rooms: an acoustics lab with a reverberation time of about 200 ms and a lab room with a reverberation time of about 800 ms.

Nine devices with different numbers of channels are available: five smartphones with a single recording channel each, two compact microphone arrays with 6 channels each, one compact microphone array with 4 channels, and one circular microphone array with 8 channels. In total, the recordings provide 29 channels.

The same LibriSpeech sentences and speakers as in the LibriCSS data set were re-recorded, and the LibriCSS directory structure was kept. The data set is organized into subsets with different percentages of speech overlap (0% to 40%).

LibriWASN can be used for various research purposes, e.g., as a test set for synchronization algorithms, speech separation, diarization, and meeting transcription systems in wireless acoustic ad-hoc sensor networks. Tools and scripts are available at https://github.com/fgnt/libriwasn.

To cite this data set, please refer to:

@InProceedings{SchTgbHaeb2023,
  Title = {LibriWASN: A Data Set for Meeting Separation, Diarization, and Recognition with Asynchronous Recording Devices},
  Author = {Joerg Schmalenstroeer and Tobias Gburrek and Reinhold Haeb-Umbach},
  Booktitle = {ITG conference on Speech Communication (ITG 2023)},
  Year = {2023},
  Month = {Sep},
}

A preview of the paper is available at http://arxiv.org/abs/2308.10682
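
The device inventory and the effect of a sampling rate offset described above can be sanity-checked with a few lines of Python. This is only an illustrative sketch: the device labels, the function names, the 16 kHz nominal rate, and the 50 ppm offset are assumptions chosen for the example and are not taken from the data set documentation or the libriwasn tools.

```python
# Device inventory as stated in the description:
# (label, number of devices, channels per device). Labels are illustrative.
DEVICES = [
    ("smartphone", 5, 1),
    ("compact microphone array, 6 channels", 2, 6),
    ("compact microphone array, 4 channels", 1, 4),
    ("circular microphone array, 8 channels", 1, 8),
]


def total_devices(devices):
    """Number of physical recording devices."""
    return sum(count for _, count, _ in devices)


def total_channels(devices):
    """Number of recorded channels summed over all devices."""
    return sum(count * channels for _, count, channels in devices)


def drift_in_samples(duration_s, nominal_rate_hz, sro_ppm):
    """Sample offset accumulated between two devices after duration_s seconds
    when their sampling clocks differ by sro_ppm parts per million."""
    return duration_s * nominal_rate_hz * sro_ppm * 1e-6


if __name__ == "__main__":
    print("devices:", total_devices(DEVICES))    # 9
    print("channels:", total_channels(DEVICES))  # 29
    # Hypothetical numbers: a 10-minute recording at an assumed 16 kHz
    # nominal rate with an assumed 50 ppm sampling rate offset.
    print("drift (samples):", drift_in_samples(600, 16000, 50))
```

With the assumed values, the two devices drift apart by roughly 480 samples (about 30 ms at 16 kHz) over ten minutes, which suggests why synchronization is listed among the intended uses of the data set.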

Created: 2024-04-13