AISHELL-1
DataCite Commons | Updated 2021-07-01 | Indexed 2025-04-16
Download link:
https://catalog.ldc.upenn.edu/LDC2018S14
Description:
<h3>Introduction</h3><br>
<p>AISHELL-1 was developed by <a href="http://www.aishelltech.com/sy">Beijing Shell Shell Technology Co., Ltd.</a> It contains approximately 520 hours of Chinese Mandarin speech from 400 speakers recorded simultaneously on three different devices with associated transcripts.</p><br>
<p>The goal of the collection was to support speech recognition system development in 11 domains, five of which are included in this corpus: Finance, Science & Technology, Sports, Entertainment, and News. Participants read 500 sentences covering the domains; sentences were chosen for their speech and phonetic characteristics.</p><br>
<p>Speakers were recruited from different accent areas across China, including North, South and Yue-Gui-Min regions. There were 214 female speakers and 186 male speakers, constituting 53% and 47% of the database, respectively. Additional demographic information about the participants is included in this release.</p><br>
<h3>Data</h3><br>
<p>Speech was recorded in a quiet indoor environment on a high-fidelity microphone and two mobile phones (Android and iOS). All speech is presented as 16-bit FLAC-compressed WAV files; the microphone speech sample rate is 44.1 kHz and the phone speech sample rate is 16 kHz. Each speech file ranges from approximately 1 second to 14 seconds in length.</p><br>
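<p>As a rough illustration of what the audio parameters above imply, the following sketch estimates the uncompressed PCM size of the shortest and longest utterances at each sample rate. The helper <code>pcm_bytes</code> is a hypothetical name, and the single-channel (mono) default is an assumption, since the description does not state a channel count. FLAC is lossless compression, so files on disk will be smaller than these figures.</p><br>

```python
def pcm_bytes(duration_s: float, sample_rate_hz: int,
              bit_depth: int = 16, channels: int = 1) -> int:
    """Return the uncompressed PCM payload size in bytes (no container header).

    channels=1 is an assumption; the corpus description does not state it.
    """
    bytes_per_sample = bit_depth // 8
    total_samples = int(duration_s * sample_rate_hz)
    return total_samples * bytes_per_sample * channels


# Sample rates and the approximate utterance length range from the description.
MIC_RATE_HZ = 44_100
PHONE_RATE_HZ = 16_000

for label, rate in (("microphone", MIC_RATE_HZ), ("phone", PHONE_RATE_HZ)):
    for seconds in (1, 14):
        size = pcm_bytes(seconds, rate)
        print(f"{label:10s} {seconds:2d} s at {rate} Hz: {size:,} bytes uncompressed")
```

<p>Under these assumptions, a microphone recording carries about 2.76 times as many samples per second as a phone recording of the same utterance (44,100 / 16,000).</p><br>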
<p>Transcripts are stored as UTF-8 encoded plain text files and are not time-aligned.</p><br>
<h3>Samples</h3><br>
<p>Please view the following samples:</p><br>
<ul><br>
<li><a href="desc/addenda/LDC2018S14.mic.flac">Microphone</a></li><br>
<li><a href="desc/addenda/LDC2018S14.and.flac">Android</a></li><br>
<li><a href="desc/addenda/LDC2018S14.ios.flac">iOS</a></li><br>
<li><a href="desc/addenda/LDC2018S14.txt">Transcript</a></li><br>
</ul><br>
<h3>Updates</h3><br>
<p>None at this time.</p><br>
Portions © 2018 Beijing Shell Shell Technology Co., Ltd., © 2018 Trustees of the University of Pennsylvania
Provider:
Linguistic Data Consortium
Created:
2020-11-30



