Shanghai Dialect Speech Data by Mobile Phone - 1,030 Hours

Source: catalogue.elra.info (indexed 2025-03-25)
Download link:
https://catalogue.elra.info/en-us/repository/browse/ELRA-S0477/
Resource description:
The corpus contains recordings of 2,956 speakers from Shanghai, made in a quiet indoor environment. The recorded content includes multi-domain customer consultation, short messages, numbers, and Shanghai points of interest (POI). The corpus contains no repeated content, and the average sentence length is 12.68 words. Recording devices are mainstream Android phones and iPhones.

Format: 16 kHz, 16-bit, uncompressed WAV, mono channel
Recording environment: quiet indoor environment, without echo
Recording content (read speech): generic category, human-machine interaction category, numbers, Shanghai POI
Speakers: 2,956 people, of whom 1,921 are female (65%)
Devices: Android mobile phones and iPhones
Language: Shanghai dialect
Transcription content: text, 4 noise symbols, special identifiers
Accuracy rate: 95% (the accuracy of noise symbols and other identifiers is not included)
Application scenarios: speech recognition, voiceprint recognition
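The stated audio format (16 kHz, 16-bit, mono, uncompressed WAV) can be checked on any file with Python's standard `wave` module. The following is a minimal sketch, not part of the ELRA distribution: the function names `matches_corpus_format` and `estimate_pcm_bytes` are hypothetical, and the size estimate assumes the 1,030 hours in the title refers to total audio duration and ignores WAV headers and transcription files.

```python
import wave

# Expected values taken from the "Format" line of the description.
EXPECTED_RATE_HZ = 16000
EXPECTED_SAMPLE_WIDTH_BYTES = 2  # 16-bit samples
EXPECTED_CHANNELS = 1            # mono


def matches_corpus_format(path):
    """Return True if the WAV file is 16 kHz, 16-bit, mono, uncompressed PCM.

    Hypothetical helper for illustration; the corpus ships no such tool.
    """
    with wave.open(path, "rb") as wav:
        return (
            wav.getframerate() == EXPECTED_RATE_HZ
            and wav.getsampwidth() == EXPECTED_SAMPLE_WIDTH_BYTES
            and wav.getnchannels() == EXPECTED_CHANNELS
            and wav.getcomptype() == "NONE"  # "NONE" means uncompressed PCM
        )


def estimate_pcm_bytes(hours):
    """Estimate the raw PCM payload size in bytes for `hours` of audio.

    Assumption: every recording uses the expected format above.
    """
    seconds = hours * 3600
    bytes_per_second = (
        EXPECTED_RATE_HZ * EXPECTED_SAMPLE_WIDTH_BYTES * EXPECTED_CHANNELS
    )
    return seconds * bytes_per_second
```

Under these assumptions, `estimate_pcm_bytes(1030)` gives 118,656,000,000 bytes, roughly 119 GB of raw audio. The speaker line is also internally consistent: 1,921 / 2,956 is about 0.65.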
Provider:
ELRA Catalogue of Language Resources