Datatang: 110 Hours of Henan Dialect Speech Data Collected by Mobile Phone
Source: ModelScope community. Updated 2026-05-11; indexed 2024-05-15.
Download link:
https://modelscope.cn/datasets/DatatangBeijing/110Hours_HenanDialectSpeechDataByMobilePhone
Resource description:
This 110-hour Henan dialect speech dataset, collected by mobile phone, includes 463 speakers in total. All speakers are local Henan residents, so the accents are authentic. The recorded content is broad, covering everyday text messages and customer inquiries across multiple domains. Local Henan residents also performed the quality inspection and proofreading, which makes the text transcriptions more accurate. It is compatible with mainstream Android and Apple iOS mobile phones.
Provider:
maas
Created:
2024-05-07
Collected summary
Dataset introduction

Background and challenges
Background overview
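This dataset contains 110 hours of Henan dialect speech collected by mobile phone and recorded by 463 local Henan residents. The content covers everyday conversation and customer-service inquiries, and the data is intended for testing Chinese speech recognition models. The audio is 16 kHz, 16-bit, mono WAV. After quality checks, sentence-level transcription accuracy is no lower than 98%. This is a commercial dataset.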
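The stated audio format (16 kHz, 16-bit, mono WAV) can be checked on a downloaded file before it is used for testing. The following is a minimal sketch using only Python's standard `wave` module. The function names `check_wav_format` and `duration_seconds` are hypothetical helpers introduced here for illustration, and the sketch assumes the files are plain PCM WAV; it is not part of the dataset's tooling.

```python
import wave

# Expected format, taken from the dataset description above.
EXPECTED_RATE_HZ = 16000
EXPECTED_SAMPLE_WIDTH_BYTES = 2  # 16-bit samples
EXPECTED_CHANNELS = 1            # mono


def check_wav_format(path):
    """Return a list of mismatches against the stated format (empty if none)."""
    problems = []
    with wave.open(path, "rb") as wav:
        if wav.getframerate() != EXPECTED_RATE_HZ:
            problems.append(f"sample rate is {wav.getframerate()} Hz")
        if wav.getsampwidth() != EXPECTED_SAMPLE_WIDTH_BYTES:
            problems.append(f"sample width is {wav.getsampwidth() * 8} bit")
        if wav.getnchannels() != EXPECTED_CHANNELS:
            problems.append(f"channel count is {wav.getnchannels()}")
    return problems


def duration_seconds(path):
    """Return the length of a WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()
```

Summing `duration_seconds` over all files and dividing by 3600 gives the total hours, which can be compared with the advertised 110 hours.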
The content above was collected and summarized by 遇见数据集.



