Spoken Digits in Hindi and Indian English
Source: DataCite Commons. Updated 2022-02-15; indexed 2024-07-13.
Download link:
https://catalog.ldc.upenn.edu/LDC2022S03
Resource description:
<h3>Introduction</h3><br>
<p>Spoken Digits in Hindi and Indian English was developed by the <a href="https://www.bits-pilani.ac.in/goa/">Birla Institute of Technology and Science Pilani</a>. It contains approximately two hours of speech consisting of spoken digits from one to ten in Hindi and English, with regional accents from across India.</p><br>
<h3>Data</h3><br>
<p>The speech data was collected as follows: in person, on a mobile handset recorder app; via one-to-one online communications over social apps; and from social media sites. Each audio file represents a single spoken digit in either Hindi or Indian English. Background noise was mostly retained. Some data was recorded in a noise-free environment or cleaned after recording to avoid abrupt noises such as car horns.</p><br>
<p>The audio data is organized by number, language and gender. The gender breakdown for speakers is 17% female, 27% male, and 56% unspecified.</p><br>
<p>This release also includes a Google Colab notebook that can be used for basic operations such as removing noise or unwanted spaces.</p><br>
<p>All audio data is presented as single-channel, 16-bit, 16 kHz FLAC-compressed linear PCM.</p><br>
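<p>As a rough illustration of the stated format, the sketch below reads the STREAMINFO block at the start of a FLAC stream and checks it against the single-channel, 16-bit, 16 kHz description. It is not part of the release or of the included notebook. The function names <code>read_flac_streaminfo</code>, <code>matches_release_format</code>, and <code>check_file</code> are hypothetical, and the sketch assumes a standard FLAC stream with no leading ID3 tag.</p><br>

```python
import struct

# Format stated in the release description (16 kHz, mono, 16-bit).
EXPECTED = {"sample_rate": 16000, "channels": 1, "bits_per_sample": 16}


def read_flac_streaminfo(data: bytes) -> dict:
    """Parse sample rate, channel count, and bit depth from the first bytes
    of a FLAC stream. Hypothetical helper, not part of the release."""
    if len(data) < 26:
        raise ValueError("need at least the first 26 bytes of the stream")
    if data[:4] != b"fLaC":
        raise ValueError("missing 'fLaC' stream marker")
    # Byte 4: high bit is the last-block flag, low 7 bits are the block type.
    if data[4] & 0x7F != 0:
        raise ValueError("first metadata block is not STREAMINFO")
    # The STREAMINFO body starts at offset 8. Its first 10 bytes hold the
    # block and frame sizes; the next 8 bytes pack:
    #   20 bits sample rate | 3 bits channels-1 | 5 bits depth-1 | 36 bits samples
    (packed,) = struct.unpack(">Q", data[18:26])
    return {
        "sample_rate": packed >> 44,
        "channels": ((packed >> 41) & 0x7) + 1,
        "bits_per_sample": ((packed >> 36) & 0x1F) + 1,
        "total_samples": packed & ((1 << 36) - 1),
    }


def matches_release_format(info: dict) -> bool:
    """True if the parsed header agrees with the stated release format."""
    return all(info[key] == value for key, value in EXPECTED.items())


def check_file(path: str) -> bool:
    """Check one file on disk; `path` is whatever local path the file has."""
    with open(path, "rb") as handle:
        return matches_release_format(read_flac_streaminfo(handle.read(42)))
```

<p>Since each file holds a single spoken digit, dividing <code>total_samples</code> by <code>sample_rate</code> gives the clip duration in seconds, which can serve as a quick sanity check when loading the data.</p><br>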
<h3>Samples</h3><br>
<p>Please view these samples:</p><br>
<ul><br>
<li><a href="desc/addenda/LDC2022S03-hin-f.flac">Hindi Female (FLAC)</a></li><br>
<li><a href="desc/addenda/LDC2022S03-eng-u.flac">English Unspecified (FLAC)</a></li><br>
<li><a href="desc/addenda/LDC2022S03-eng-m.flac">English Male (FLAC)</a></li><br>
</ul><br>
<h3>Updates</h3><br>
<p>None at this time.</p><br>
Portions © 2022 Basabdatta Sen Bhattacharya, © 2022 Trustees of the University of Pennsylvania
Provider:
Linguistic Data Consortium
Created:
2022-02-11



