Datatang: 1,025 Hours of Heavily Accented Mandarin Speech Data Collected via Mobile Phone

Name: Datatang: 1,025 Hours of Heavily Accented Mandarin Speech Data Collected via Mobile Phone
Creator: maas
Published: 2025-12-02 10:51:18
License: No description available

ModelScope community; updated 2025-12-02; indexed 2024-05-15

Download link:

https://modelscope.cn/datasets/DatatangBeijing/1025Hours-MandarinStrongAccentSpeechDataByMobilePhone


Resource overview:

This dataset contains 1,025 hours of heavily accented Mandarin speech collected via mobile phones. More than 2,000 native Chinese speakers took part in the recording, most of them from southern China, with some from northern provinces known for strong accents; the gender distribution is balanced. All recordings are in heavily accented Mandarin and reflect distinct regional characteristics. The recorded content covers several categories, including mobile voice assistant interactions, smart home commands, in-vehicle commands, and digits, matching practical application scenarios such as smart home and in-vehicle systems.

Provider:

maas

Created:

2024-05-07

Collection summary

Dataset introduction