Far-Field Voice Assistant Speech Database
Source: Beijing International Big Data Exchange (北京国际大数据交易所), indexed 2024-05-28
Download link:
https://webs.bjidex.com/sys-bsc-home/#/bscConsole/tradingMarket/detail?id=1924
Resource description:
The AISHELL-ASR0070 far-field voice assistant speech database contains 6,200 hours of audio from 450 speakers recruited across northern and southern China, most of them from the north. The recording language is Chinese and the recordings were made in China. The recorded text consists of voice assistant interactions. Recording took place in real home environments, with 5, 4, and 3 recording positions set up in the living room, bedroom, and kitchen respectively. Far-field pickup used a circular 16-channel PDM microphone array board (16 kHz, 16-bit; 6,083 hours), and near-field pickup used a high-fidelity microphone (44.1 kHz, 16-bit; 117 hours). The database was transcribed and annotated by professional speech proofreaders and passed strict quality inspection, with an accuracy above 98%. It can be used for research such as voiceprint recognition and speech recognition.
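The figures in the description can be cross-checked with a short sketch. It encodes the two recording streams and the room layout as stated in the listing, then confirms that the per-stream hours add up to the 6,200-hour total. The class and constant names are hypothetical, and the data-rate figures rest on two assumptions that the listing does not state: that the audio is stored as uncompressed PCM, and that the near-field microphone is mono.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stream:
    """One recording stream as described in the listing (names are illustrative)."""
    name: str
    sample_rate_hz: int
    bit_depth: int
    channels: int
    hours: int

    def bytes_per_second(self) -> int:
        # Raw data rate, ASSUMING uncompressed PCM storage.
        return self.sample_rate_hz * (self.bit_depth // 8) * self.channels


# Values taken from the listing; the channel count of 1 for the
# near-field microphone is an ASSUMPTION (the listing does not say).
FAR_FIELD = Stream("far-field array", 16_000, 16, 16, 6083)
NEAR_FIELD = Stream("near-field microphone", 44_100, 16, 1, 117)

TOTAL_HOURS = 6200
POSITIONS = {"living room": 5, "bedroom": 4, "kitchen": 3}


def far_field_share_percent() -> float:
    """Share of the total hours that comes from the far-field array."""
    return round(100 * FAR_FIELD.hours / TOTAL_HOURS, 1)


if __name__ == "__main__":
    print("hours add up:", FAR_FIELD.hours + NEAR_FIELD.hours == TOTAL_HOURS)
    print("recording positions:", sum(POSITIONS.values()))
    print("far-field share (%):", far_field_share_percent())
    for s in (FAR_FIELD, NEAR_FIELD):
        print(s.name, "bytes/s:", s.bytes_per_second())
```

The check on the hours is the useful part: 6,083 far-field hours plus 117 near-field hours equals the stated 6,200 hours, and the three rooms together provide 12 recording positions.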
Provider:
Beijing Shell Shell Technology Co., Ltd. (北京希尔贝壳科技有限公司)
Collected summary
Dataset introduction

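Background and challenges
Background overview
The Far-Field Voice Assistant Speech Database contains 6,200 hours of Chinese speech recorded by 450 speakers from northern and southern China in real home environments, using dedicated equipment for far-field and near-field pickup. The data has been professionally annotated with an accuracy above 98% and is suited to voiceprint recognition and speech recognition research.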
The content above was collected and summarized by 遇见数据集.



