AISHELL-1

Mendeley Data | Updated 2024-01-31 | Indexed 2024-06-28

Download link:

https://catalog.ldc.upenn.edu/LDC2018S14

Resource description:

Introduction

AISHELL-1 was developed by Beijing Shell Shell Technology Co., Ltd. It contains approximately 520 hours of Chinese Mandarin speech from 400 speakers, recorded simultaneously on three different devices, with associated transcripts. The goal of the collection was to support speech recognition system development in 11 domains, five of which are included in this corpus: Finance, Science & Technology, Sports, Entertainment, and News. Participants read 500 sentences covering the domains; sentences were chosen for their speech and phonetic characteristics. Speakers were recruited from different accent areas across China, including North, South and Yue-Gui-Min regions. There were 214 female speakers and 186 male speakers, constituting 53% and 47% of the database, respectively. Additional demographic information about the participants is included in this release.

Data

Speech was recorded in a quiet indoor environment on a high-fidelity microphone and two mobile phones (Android and iOS). All speech is presented as 16-bit FLAC-compressed WAV files; the microphone speech sample rate is 44.1 kHz and the phone speech sample rate is 16 kHz. Each speech file ranges from approximately 1 second to 14 seconds in length. Transcripts are stored as UTF-8 encoded plain text files and are not time-aligned.

Samples

Please view the following samples: Microphone, Android, iOS, Transcript.

Updates

None at this time.

Portions © 2018 Beijing Shell Shell Technology Co., Ltd., © 2018 Trustees of the University of Pennsylvania
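The format details given in the description (16-bit FLAC, 44.1 kHz for microphone audio, 16 kHz for phone audio, clips of roughly 1 to 14 seconds) can in principle be checked from each file's header. The following is a minimal sketch, not part of the corpus tooling: it reads the STREAMINFO metadata block that the FLAC format places at the start of every stream, using only the Python standard library. The names `FlacInfo` and `read_flac_streaminfo` are hypothetical and introduced here for illustration, and the sketch assumes the files are plain FLAC streams with no leading tag data such as an ID3 header.

```python
from typing import BinaryIO, NamedTuple


class FlacInfo(NamedTuple):
    """Fields decoded from a FLAC STREAMINFO block (hypothetical helper type)."""
    sample_rate: int       # in Hz
    channels: int
    bits_per_sample: int
    total_samples: int     # 0 means "unknown" in the FLAC format

    @property
    def duration_seconds(self) -> float:
        # Duration in seconds; only meaningful when total_samples is known.
        return self.total_samples / self.sample_rate


def read_flac_streaminfo(stream: BinaryIO) -> FlacInfo:
    """Parse the mandatory STREAMINFO block at the start of a FLAC stream."""
    if stream.read(4) != b"fLaC":
        raise ValueError("not a FLAC stream (missing 'fLaC' marker)")

    # Metadata block header: 1 bit last-block flag, 7 bits type, 24 bits length.
    header = stream.read(4)
    if len(header) != 4:
        raise ValueError("truncated metadata block header")
    block_type = header[0] & 0x7F
    length = int.from_bytes(header[1:4], "big")
    if block_type != 0 or length != 34:
        raise ValueError("first metadata block is not STREAMINFO")

    body = stream.read(34)
    if len(body) != 34:
        raise ValueError("truncated STREAMINFO block")

    # Bytes 10..17 pack, from the most significant bit down:
    # 20-bit sample rate, 3-bit (channels - 1),
    # 5-bit (bits per sample - 1), 36-bit total sample count.
    packed = int.from_bytes(body[10:18], "big")
    return FlacInfo(
        sample_rate=packed >> 44,
        channels=((packed >> 41) & 0x7) + 1,
        bits_per_sample=((packed >> 36) & 0x1F) + 1,
        total_samples=packed & ((1 << 36) - 1),
    )
```

If the description is accurate, opening a microphone recording in binary mode and passing it to `read_flac_streaminfo` should report 44100 Hz and 16 bits per sample, a phone recording should report 16000 Hz, and `duration_seconds` should fall roughly between 1 and 14.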

Created:

2024-01-31

Background overview

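AISHELL-1 is a Mandarin Chinese speech recognition dataset developed by Beijing Shell Shell Technology Co., Ltd. It contains approximately 520 hours of speech from 400 speakers, recorded simultaneously on a high-fidelity microphone and two mobile phones, and covers five domains: finance, science and technology, sports, entertainment, and news. The dataset is intended to support speech recognition system development. Speakers come from different accent regions of China, and the gender split is roughly balanced (53% female, 47% male). Audio files are provided in FLAC format at sample rates of 44.1 kHz and 16 kHz respectively, with UTF-8 encoded transcripts.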

The summary above was collected and generated by 遇见数据集.
