Datatang: 3,255 Hours of Chinese Children's Speech Data Collected by Mobile Phone
Source: ModelScope community (魔搭社区) · Updated 2025-12-04 · Indexed 2024-05-15
Download link:
https://modelscope.cn/datasets/DatatangBeijing/3255Hours-ChineseChildrenSpeechdataByMobilephone
Description:
This dataset contains Chinese children's speech recorded on mobile phones. It has a total duration of 3,255 hours, with approximately 9,780 speakers, all aged 6 to 12. Accents cover seven major Chinese dialect regions. The recording scripts include sentences children commonly use, such as compositions, stories, and numbers, as well as in-vehicle, smart home, and voice assistant interactions, so the material matches real-world application scenarios. All sentences are manually transcribed with high accuracy.
Provider:
maas
Created:
2024-05-06
Collected Summary
Dataset Introduction

Background and Challenges
Background Overview
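The dataset contains 3,255 hours of Chinese children's speech from approximately 9,780 speakers aged 6 to 12, covering seven major dialect regions. The content spans compositions, stories, numbers, and human-machine interaction scenarios, and all sentences are manually transcribed with high accuracy. The data is intended for testing speech recognition models for Chinese children. Audio is in WAV format and was recorded in quiet indoor environments.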
The content above was collected and summarized by 遇见数据集.
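Since the summary states that the audio is delivered as WAV files, a quick sanity check after download is to total the audio duration and compare it with the advertised figure. The following is a minimal sketch using only Python's standard `wave` module. The directory layout, the function names, and the 16 kHz mono 16-bit settings in the demo helper are illustrative assumptions; none of them come from the dataset documentation.

```python
import wave
from pathlib import Path


def wav_duration_seconds(path):
    """Return the duration of one PCM WAV file in seconds."""
    with wave.open(str(path), "rb") as wav_file:
        frames = wav_file.getnframes()
        rate = wav_file.getframerate()
    return frames / rate


def total_hours(root):
    """Sum the durations of all .wav files under root, in hours."""
    seconds = sum(wav_duration_seconds(p) for p in Path(root).rglob("*.wav"))
    return seconds / 3600.0


def write_silence(path, seconds=1, rate=16000):
    """Write a silent mono 16-bit WAV file (demo only).

    The 16 kHz rate is an assumption for illustration, not a documented
    property of the dataset.
    """
    with wave.open(str(path), "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        wav_file.setframerate(rate)
        wav_file.writeframes(b"\x00\x00" * rate * seconds)
```

Calling `total_hours("path/to/extracted/dataset")` on the unpacked audio (the path is a placeholder) should give a figure near the listed 3,255 hours if the download is complete. `write_silence` exists only so the helpers can be tried without the real data.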



