
Spoken Digits in Hindi and Indian English

Source: DataCite Commons. Updated: 2022-02-15. Indexed: 2024-07-13.
Download link:
https://catalog.ldc.upenn.edu/LDC2022S03
Description:
<h3>Introduction</h3>
<p>Spoken Digits in Hindi and Indian English was developed by the <a href="https://www.bits-pilani.ac.in/goa/">Birla Institute of Technology and Science Pilani</a>. It contains approximately two hours of speech consisting of spoken digits from one to ten in Hindi and English, with regional accents from across India.</p>
<h3>Data</h3>
<p>The speech data was collected in three ways: in person, using a recorder app on a mobile handset; via one-to-one online communication over social apps; and from social media sites. Each audio file represents a single spoken digit in either Hindi or Indian English. Background noise was mostly retained. Some data was recorded in a noise-free environment or cleaned after recording to remove abrupt noises such as car horns.</p>
<p>The audio data is organized by number, language, and gender. The gender breakdown for speakers is 17% female, 27% male, and 56% unspecified.</p>
<p>This release also includes a Google Colab Notebook file that can be used for basic functions such as removing noise or unwanted spaces.</p>
<p>All audio data is presented as single-channel, 16-bit, 16 kHz FLAC-compressed linear PCM.</p>
<h3>Samples</h3>
<p>Please view these samples:</p>
<ul>
<li><a href="desc/addenda/LDC2022S03-hin-f.flac">Hindi Female (FLAC)</a></li>
<li><a href="desc/addenda/LDC2022S03-eng-u.flac">English Unspecified (FLAC)</a></li>
<li><a href="desc/addenda/LDC2022S03-eng-m.flac">English Male (FLAC)</a></li>
</ul>
<h3>Updates</h3>
<p>None at this time.</p>
Portions © 2022 Basabdatta Sen Bhattacharya, © 2022 Trustees of the University of Pennsylvania
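As a rough illustration of what the stated audio format implies, the sketch below estimates the uncompressed linear PCM size for a given duration, using the format given in the description (one channel, 16 bits per sample, 16 kHz). The function name `pcm_bytes` and the constants are illustrative and not part of the release. The two-hour figure is the approximate total quoted above, so the result is only an estimate, and the FLAC files as distributed will be smaller.

```python
# Format parameters taken from the dataset description.
SAMPLE_RATE_HZ = 16_000   # 16 kHz
BITS_PER_SAMPLE = 16      # 16-bit linear PCM
CHANNELS = 1              # single channel


def pcm_bytes(duration_seconds: float) -> int:
    """Return the uncompressed PCM size in bytes for the given duration.

    Hypothetical helper for illustration only; it ignores container
    headers and FLAC compression.
    """
    bytes_per_sample = BITS_PER_SAMPLE // 8
    return int(duration_seconds * SAMPLE_RATE_HZ * bytes_per_sample * CHANNELS)


if __name__ == "__main__":
    # "Approximately two hours" is the total stated in the description.
    two_hours = 2 * 60 * 60
    total = pcm_bytes(two_hours)
    print(f"{total} bytes (~{total / 2**20:.0f} MiB) before FLAC compression")
```

At this format each second of audio is 32,000 bytes of raw PCM, so two hours comes to 230,400,000 bytes before compression.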
Provider:
Linguistic Data Consortium
Created:
2022-02-11