MAGICDATA Mandarin Chinese Read Speech Corpus (Training Set)

Indexed by 帕依提提 (Payititi) on 2024-03-04
Download link:
https://www.payititi.com/opendatasets/show-26453.html
Resource description:
The MAGICDATA Mandarin Chinese Read Speech Corpus was developed by MAGIC DATA Technology Co., Ltd. and is freely published for non-commercial use. The contents and the corresponding descriptions of the corpus include: The corpus aims to support researchers in speech recognition, machine translation, speaker recognition, and other speech-related fields. It is therefore entirely free for academic use. The corpus is a subset of a much larger data set (the 10566.9-hour Chinese Mandarin Speech Corpus) that was recorded in the same environment. Please feel free to contact us via business@magicdatatech.com for more details.

Citation
Please cite the corpus as "Magic Data Technology Co., Ltd., "http://www.imagicdatatech.com/index.php/home/dataopensource/data_info/id/101", 05/2019".

About us
Magic Data Technology Co., Ltd. (referred to as Magic Data) was established in 2016. Through data services of high expertise and high precision, Magic Data has quickly grown into one of the foremost companies in the artificial intelligence industry. We strive to provide the most efficient and highest-quality one-stop data services for customers in the fields of speech recognition, intelligent imaging, and Natural Language Understanding (NLU). Our services include data scheme design, data collection, data annotation/transcription, etc.

Contact
External URL: http://www.imagicdatatech.com/index.php/home/dataopensource/data_info/id/101
Full description from the company website.

Provider:
帕依提提 (Payititi)
Collected summary
Dataset introduction
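Background and challenges
Background overview
The MAGICDATA Mandarin Chinese Read Speech Corpus (Training Set) is a public dataset developed by MAGIC DATA Technology Co., Ltd. It contains 755 hours of high-quality read Mandarin speech, recorded mostly on mobile devices, from 1080 speakers from different accent areas in China, with a transcription accuracy above 98%. The dataset is intended to support research in areas such as speech recognition, machine translation, and speaker recognition, and is free for academic use. Its text covers diverse domains, including interactive Q&A, music search, and social media messages.
The above content was collected and summarized by 遇见数据集.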