MAGICDATA Mandarin Chinese Read Speech Corpus (Training Set)

Name: MAGICDATA Mandarin Chinese Read Speech Corpus (Training Set)
Creator: 帕依提提
License: No description available

Indexed by 帕依提提 on 2024-03-04

Download link:

https://www.payititi.com/opendatasets/show-26453.html

Resource description:

MAGICDATA Mandarin Chinese Read Speech Corpus was developed by MAGIC DATA Technology Co., Ltd. and is freely published for non-commercial use. The corpus aims to support researchers in speech recognition, machine translation, speaker recognition, and other speech-related fields, and is therefore free for academic use. It is a subset of a much larger dataset (a 10566.9-hour Chinese Mandarin speech corpus) that was recorded in the same environment. Please feel free to contact us via business@magicdatatech.com for more details.

Citation
Please cite the corpus as: Magic Data Technology Co., Ltd., "http://www.imagicdatatech.com/index.php/home/dataopensource/data_info/id/101", 05/2019.

About us
Magic Data Technology Co., Ltd. (referred to as Magic Data) was established in 2016. Through its highly specialized, high-precision data services, Magic Data has quickly grown into one of the foremost companies in the artificial intelligence industry. We strive to provide efficient, high-quality one-stop data services for customers in the fields of speech recognition, intelligent imaging, and Natural Language Understanding (NLU). Our services include data scheme design, data collection, and data annotation/transcription.

Contact
External URL: http://www.imagicdatatech.com/index.php/home/dataopensource/data_info/id/101 (full description on the company website)

Provided by:

帕依提提

Collected summary

Dataset introduction

Background and challenges

Background overview

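The MAGICDATA Mandarin Chinese Read Speech Corpus (training set) is a public dataset developed by MAGIC DATA Technology Co., Ltd. It contains 755 hours of high-quality read Mandarin speech, recorded mainly on mobile devices, from 1,080 speakers across different accent regions of China, with a transcription accuracy above 98%. The dataset is intended to support research in speech recognition, machine translation, and speaker recognition, and is free for academic use. Its text covers diverse domains, including interactive Q&A, music search, and social media messages.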

The summary above was collected and generated by 遇见数据集.
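
The figures quoted in the background overview (755 hours, 1,080 speakers) and in the resource description (a 10566.9-hour parent corpus) allow a quick sanity check of the dataset's scale. The following is only a rough sketch: the function name and dictionary keys are illustrative, the per-speaker value is a plain mean (the source does not say how evenly speech is spread across speakers), and the share assumes the 755 hours are counted in the same way as the 10566.9 hours.

```python
# Figures quoted in the dataset description.
TRAIN_HOURS = 755.0          # hours of read speech in the training set
NUM_SPEAKERS = 1080          # speakers in the training set
FULL_CORPUS_HOURS = 10566.9  # hours in the larger parent corpus


def derived_stats(train_hours, num_speakers, full_hours):
    """Return rough figures derived from the quoted totals (hypothetical helper)."""
    return {
        # Plain mean; the real per-speaker distribution is not given.
        "minutes_per_speaker": train_hours / num_speakers * 60.0,
        # Share of the parent corpus, assuming comparable hour counts.
        "fraction_of_full_corpus": train_hours / full_hours,
    }


if __name__ == "__main__":
    stats = derived_stats(TRAIN_HOURS, NUM_SPEAKERS, FULL_CORPUS_HOURS)
    print(f"Mean speech per speaker: {stats['minutes_per_speaker']:.1f} min")
    print(f"Share of the parent corpus: {stats['fraction_of_full_corpus']:.1%}")
```

Under these assumptions, each speaker contributes a little over 40 minutes on average, and the training set amounts to roughly 7% of the parent corpus.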