Hindko Voice Dataset (HVD)

Name: Hindko Voice Dataset (HVD)
Creator: Mendeley Data
License: No description available

Indexed from doi.org on 2025-03-22

Download link:

http://doi.org/10.17632/yjhz8z7mv5.1


Description:

Hindko is a language spoken mostly in the northwestern areas of Pakistan, by about 8 million people. According to its native speakers, it is the 7th largest language of Pakistan and the 2nd largest language of Khyber Pakhtunkhwa. The Hazara region is the cultural hub of the Hindko language: about 80% of the population of districts such as Haripur, Abbottabad, and Mansehra speaks Hindko. Spoken content in Hindko covers a wide range of subjects, including religion, education, poetry, politics, theater, and many more. Despite all this, Hindko still needs a voice recognition system that would enhance accessibility, help preserve the language, and promote digital inclusion for its speakers. The dataset consists of the 20 Hindko numbers from 1 to 20. We asked every individual to speak these 20 numbers in one recording and send it over WhatsApp. About 300 individuals participated in this project, and we took 3 samples from each individual. We then used Audacity to split each number out of the recording and saved each one separately.
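The description says each recording of 20 spoken numbers was split into individual clips by hand in Audacity. As a rough illustration of how that step could be automated, the sketch below finds non-silent stretches in a recording using a simple per-frame energy threshold. This is not the authors' method: the function name `split_on_silence` and the parameters `frame_len`, `threshold`, and `min_silence_frames` are hypothetical, and the values would need tuning for real WhatsApp voice recordings.

```python
from typing import List, Sequence, Tuple


def split_on_silence(
    samples: Sequence[float],
    frame_len: int = 400,
    threshold: float = 0.02,
    min_silence_frames: int = 5,
) -> List[Tuple[int, int]]:
    """Return (start, end) sample ranges of non-silent segments.

    Assumption: samples are floats in [-1.0, 1.0]; all defaults are
    illustrative guesses, not values taken from the dataset.
    """
    segments: List[Tuple[int, int]] = []
    start = None      # first sample of the segment being built, if any
    silent_run = 0    # consecutive silent frames seen inside a segment
    n_frames = len(samples) // frame_len

    for i in range(n_frames):
        frame = samples[i * frame_len:(i + 1) * frame_len]
        # Mean absolute amplitude serves as a cheap energy measure.
        energy = sum(abs(x) for x in frame) / frame_len

        if energy >= threshold:
            if start is None:
                start = i * frame_len
            silent_run = 0
        elif start is not None:
            silent_run += 1
            if silent_run >= min_silence_frames:
                # Close the segment at the end of the last voiced frame.
                end = (i + 1 - silent_run) * frame_len
                segments.append((start, end))
                start = None
                silent_run = 0

    if start is not None:
        # The recording ended before a long enough pause was seen.
        end = (n_frames - silent_run) * frame_len
        segments.append((start, end))

    return segments
```

With a recording that follows the protocol described above, a well-tuned threshold should yield 20 segments, one per number; any other count is a signal to inspect that recording by hand.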


Provided by:

Mendeley Data
