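Datatang: 67 Hours of Northeastern Dialect Speech Data Collected by Mobile Phone
Source: ModelScope community (魔搭社区). Updated 2025-11-25; indexed 2024-05-15.
Download link: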
https://modelscope.cn/datasets/DatatangBeijing/67Hours-NortheastDialectSpeechDataByMobilePhone
Resource description:
The 67-hour Northeastern Chinese dialect speech dataset was collected on mobile phones and recorded by 312 Northeastern dialect speakers, most of whom come from the three provinces of Northeast China. All speakers read the preset texts in Northeastern dialect. The recorded content is varied, covering customer consultations and SMS texts across nearly 30 domains. The sentence-level transcripts were manually transcribed and proofread by professional annotators, giving high accuracy.
Provider:
maas
Created:
2024-05-06
Collected summary
Dataset introduction

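Background and challenges
Background overview
This dataset contains 67 hours of Northeastern dialect speech collected on mobile phones, recorded by 312 speakers from Northeast China and covering content from nearly 30 domains. It is intended for testing Chinese speech recognition models. The audio is stored as 16 kHz, 16-bit, mono WAV files, and the annotation accuracy is at least 98%.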
The content above was collected and summarized by 遇见数据集.
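
The audio format stated in the background overview (16 kHz, 16-bit, mono WAV) can be checked on a downloaded file before it is fed to a speech recognition pipeline. The following is a minimal sketch using only the Python standard library. The function name `check_wav_format` and the `EXPECTED` dictionary are hypothetical names introduced for illustration, and the sketch assumes the files are uncompressed PCM WAV, which the listing does not state explicitly.

```python
import wave

# Format stated in the dataset overview: 16 kHz, 16-bit (2 bytes), mono.
EXPECTED = {"sample_rate": 16000, "sample_width_bytes": 2, "channels": 1}


def check_wav_format(path):
    """Read a WAV header and compare it with the stated dataset format.

    Returns (matches, info), where info holds the values read from the
    file plus its duration in seconds.
    """
    with wave.open(path, "rb") as wf:
        sample_rate = wf.getframerate()
        info = {
            "sample_rate": sample_rate,
            "sample_width_bytes": wf.getsampwidth(),
            "channels": wf.getnchannels(),
            "duration_s": wf.getnframes() / sample_rate,
        }
    matches = all(info[key] == value for key, value in EXPECTED.items())
    return matches, info
```

Summing `duration_s` over all files would give a rough check against the stated 67 hours, assuming every recording in the download is included in that total.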



