Datatang: 1,351 Hours of Mandarin Natural Conversational Speech Data (Mobile Phone + Voice Recorder)
Source: ModelScope (魔搭社区). Updated 2025-11-29; indexed 2024-05-15.
Download link:
https://modelscope.cn/datasets/DatatangBeijing/1351Hours-MandarinConversationalSpeechDataByMobilePhoneAndVoiceRecorder
Resource description:
This dataset contains 1,351 hours of natural conversational Mandarin speech, recorded with mobile phones and digital voice recorders by 1,950 speakers. The speakers talked face to face in a natural manner, speaking freely on several pre-specified topics that span a wide range of domains. The recorded speech is natural and fluent and matches real-world conversational scenarios. All 1,351 hours have been transcribed manually, with high accuracy.
Provider:
maas
Created:
2024-05-06
Collected summary
Dataset introduction

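Background and challenges
Background overview
This dataset contains 1,351 hours of natural conversational Mandarin speech, recorded by 1,950 speakers using mobile phones and voice recorders and covering a variety of topics. The audio is in WAV format at 16 kHz / 44.1 kHz and is mainly intended for testing Mandarin speech recognition models.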
The content above was collected and summarized by 遇见数据集.
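Because the summary lists two sample rates (16 kHz and 44.1 kHz), it can help to check each file's header before passing audio to a speech recognition model. The following is a minimal sketch using Python's standard `wave` module. The function name `describe_wav` and the returned field names are hypothetical, and the sketch assumes the files are uncompressed PCM WAV, which the listing does not explicitly confirm.

```python
import wave

# Sample rates stated in the dataset summary (16 kHz and 44.1 kHz).
EXPECTED_RATES = {16000, 44100}


def describe_wav(path):
    """Return basic header information for a PCM WAV file.

    Hypothetical helper for illustration; not part of the dataset's tooling.
    """
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        frames = wav.getnframes()
        return {
            "sample_rate_hz": rate,
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            # Duration follows from frame count divided by frames per second.
            "duration_s": frames / rate if rate else 0.0,
            # True when the rate is one of the two listed in the summary.
            "rate_matches_summary": rate in EXPECTED_RATES,
        }
```

Calling `describe_wav` on each downloaded file and grouping by `sample_rate_hz` would show which recordings need resampling to a model's expected input rate. Which rate belongs to which recording device is not stated in the listing.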



