Datatang: 248 Hours of Hangzhou Dialect Speech Data Collected by Mobile Phone
Source: ModelScope (魔搭社区). Updated 2025-12-04; indexed 2024-05-15.
Download link:
https://modelscope.cn/datasets/DatatangBeijing/248Hours-HangzhouDialectSpeechDataByMobilePhone
Description:
The 248-hour Hangzhou dialect speech dataset was collected by mobile phone from 370 speakers, each contributing approximately 500 colloquial Hangzhou dialect sentences. All speakers are local Hangzhou residents with authentic accents. The recorded content is colloquial and drawn from daily life, so the read speech sounds natural and fluent. Local Hangzhou residents also carried out quality inspection and proofreading, which makes the text transcriptions more accurate. The recording devices cover mainstream Android and Apple iOS smartphones.
Provider:
maas
Created:
2024-05-06
Dataset Introduction

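Background and Challenges
Background Overview
This dataset contains 248 hours of Hangzhou dialect speech collected by mobile phone, intended for testing Chinese speech recognition models. The data comes from 370 local Hangzhou residents, each recording about 500 everyday conversational sentences, and it has passed quality checks and transcription verification. The recordings are mono WAV files at 16 kHz, 16-bit.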
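The figures above imply a few rough sizes that help when planning a download or an evaluation run. The following is a minimal sketch, not part of the dataset's tooling: the function names `corpus_estimates` and `matches_stated_format` are hypothetical, "about 500 sentences" is treated as exactly 500, and the size estimate counts raw PCM audio only (no WAV headers, no transcripts). The format check uses Python's standard `wave` module.

```python
import io
import wave

# Figures taken from the dataset description; 500 is approximate there.
SPEAKERS = 370
UTTERANCES_PER_SPEAKER = 500
TOTAL_HOURS = 248

# Stated recording format: 16 kHz, 16-bit, mono WAV.
SAMPLE_RATE = 16_000
SAMPLE_WIDTH_BYTES = 2
CHANNELS = 1


def corpus_estimates():
    """Derive rough corpus sizes from the published figures (hypothetical helper)."""
    total_seconds = TOTAL_HOURS * 3600
    utterances = SPEAKERS * UTTERANCES_PER_SPEAKER
    bytes_per_second = SAMPLE_RATE * SAMPLE_WIDTH_BYTES * CHANNELS
    return {
        "utterances": utterances,
        "avg_utterance_seconds": total_seconds / utterances,
        # Raw PCM payload only, in decimal gigabytes.
        "pcm_gigabytes": total_seconds * bytes_per_second / 1e9,
    }


def matches_stated_format(wav_file):
    """Return True if a WAV file (path or file object) is 16 kHz, 16-bit, mono."""
    with wave.open(wav_file, "rb") as w:
        actual = (w.getframerate(), w.getsampwidth(), w.getnchannels())
    return actual == (SAMPLE_RATE, SAMPLE_WIDTH_BYTES, CHANNELS)


if __name__ == "__main__":
    for name, value in corpus_estimates().items():
        print(f"{name}: {value:,.2f}")
```

Under these assumptions the corpus holds about 185,000 utterances averaging roughly 4.8 seconds each, and the raw audio comes to roughly 28.6 GB. Running `matches_stated_format` on a few downloaded files is a cheap sanity check before feeding them to a speech recognition pipeline that expects 16 kHz mono input.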



