Falah/classification_arabic_dialects

Name: Falah/classification_arabic_dialects
Creator: Falah
Published: 2023-07-03 09:56:39
License: No description provided

Source: Hugging Face. Updated 2023-07-03; indexed 2024-03-04.

Download link:

https://hf-mirror.com/datasets/Falah/classification_arabic_dialects

Resource description:

---
dataset_info:
  features:
  - name: audio
    dtype: audio
  - name: label
    dtype:
      class_label:
        names:
          '0': Algeria
          '1': Egypt
          '2': Iraq
          '3': Jordan
          '4': Morocco
          '5': Saudi_Arabia
          '6': Sudan
          '7': Syria
          '8': Tunisia
          '9': Yemen
  splits:
  - name: train
    num_bytes: 166407297.0
    num_examples: 130
  download_size: 158117904
  dataset_size: 166407297.0
---

# Classification of Arabic Dialects Audio Dataset

This dataset contains audio samples of various Arabic dialects for the task of classification and recognition. The dataset aims to assist researchers and practitioners in developing models and systems for Arabic spoken language analysis and understanding.

## Dataset Details

- Dataset Name: Classification of Arabic Dialects Audio Dataset
- Dataset URL: [Falah/classification_arabic_dialects](https://huggingface.co/datasets/Falah/classification_arabic_dialects)
- Dataset Size: 166,407,297 bytes
- Download Size: 158,117,904 bytes
- Splits:
  - Train: 130 examples

## Class Labels and Mapping

The dataset consists of audio samples from the following Arabic dialects, along with their corresponding class labels:

- '0': Algeria
- '1': Egypt
- '2': Iraq
- '3': Jordan
- '4': Morocco
- '5': Saudi Arabia
- '6': Sudan
- '7': Syria
- '8': Tunisia
- '9': Yemen

Please refer to the dataset for the audio samples and their respective class labels.
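The label mapping above can be kept as a pair of lookup tables when preparing inputs for a classifier. The following is a minimal sketch in plain Python, assuming the identifiers from the card's metadata (where Saudi Arabia is written `Saudi_Arabia`). The names `LABEL_NAMES`, `ID2LABEL`, `LABEL2ID`, and `display_name` are illustrative and not part of the dataset.

```python
# Class names in label-id order, as listed in the dataset card metadata.
LABEL_NAMES = [
    "Algeria", "Egypt", "Iraq", "Jordan", "Morocco",
    "Saudi_Arabia", "Sudan", "Syria", "Tunisia", "Yemen",
]

# Hypothetical helper tables (names are illustrative, not from the dataset).
ID2LABEL = dict(enumerate(LABEL_NAMES))
LABEL2ID = {name: idx for idx, name in ID2LABEL.items()}


def display_name(label_id: int) -> str:
    """Return a human-readable country name for an integer class label."""
    return ID2LABEL[label_id].replace("_", " ")


print(display_name(5))    # Saudi Arabia
print(LABEL2ID["Yemen"])  # 9
```

Once the dataset is loaded with the `datasets` library, the `ClassLabel` feature on the `label` column should offer the same lookup through its `int2str` and `str2int` methods, so hand-written tables like these are mainly useful outside that library.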
## Usage Example

To play an audio sample from the dataset in a notebook and print its label, you can use the following code:

```python
from datasets import load_dataset
from IPython.display import Audio, display

# Load the dataset from the Hugging Face Hub (requires the `datasets` library)
dataset = load_dataset("Falah/classification_arabic_dialects")

country_names = ['Algeria', 'Egypt', 'Iraq', 'Jordan', 'Morocco',
                 'Saudi_Arabia', 'Sudan', 'Syria', 'Tunisia', 'Yemen']

index = 0  # Index of the audio example
example = dataset["train"][index]  # Fetch once so the audio is decoded once
label = example["label"]
country_name = country_names[int(label)]
audio_data = example["audio"]["array"]
sampling_rate = example["audio"]["sampling_rate"]

# Play audio
display(Audio(audio_data, rate=sampling_rate))

print("Class Label:", label)
print("Country Name:", country_name)
```

Set `index` to the position of the audio example you want. The code renders an audio player for that example and prints its class label together with the matching country name from the `country_names` list.

## Applications

The Classification of Arabic Dialects Audio Dataset can be utilized in various applications, including but not limited to:

- Arabic dialect classification
- Arabic spoken language recognition
- Speech analysis and understanding for Arabic dialects
- Acoustic modeling for Arabic dialects
- Cross-dialect speech processing and synthesis

Feel free to explore and leverage this dataset for your research and development tasks related to Arabic spoken language analysis and recognition.

## License

The dataset is made available under the terms of the [Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)](https://creativecommons.org/licenses/by-sa/4.0/) license.

## Citation

If you use this dataset in your research or any other work, please consider citing it as:

```
@dataset{classification_arabic_dialects,
  author = {Falah.G.Salieh},
  title = {Classification of Arabic Dialects Audio Dataset},
  year = {2023},
  publisher = {Hugging Face},
  url = {https://huggingface.co/datasets/Falah/classification_arabic_dialects},
}
```

For more information or inquiries about the dataset, please contact the dataset author(s) mentioned in the citation.

Provided by:

Falah

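Summary of original information

Classification of Arabic Dialects Audio Dataset

Dataset details

Dataset name: Classification of Arabic Dialects Audio Dataset
Dataset size: 166,407,297 bytes
Download size: 158,117,904 bytes
Splits:
- Train: 130 examples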
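
The sizes above give a rough sense of how large each recording is. The following back-of-the-envelope calculation uses only the figures listed here. The variable names are illustrative, and the per-example value is an average, not a statement about any individual clip.

```python
# Figures taken from the dataset details above.
dataset_size_bytes = 166_407_297
download_size_bytes = 158_117_904
num_train_examples = 130

# Average size per example, in bytes and in megabytes (10^6 bytes).
avg_bytes = dataset_size_bytes / num_train_examples
avg_megabytes = avg_bytes / 1_000_000

# Ratio of the download size to the prepared dataset size.
download_ratio = download_size_bytes / dataset_size_bytes

print(f"Average per example: {avg_megabytes:.2f} MB")        # 1.28 MB
print(f"Download / dataset size: {download_ratio:.3f}")      # 0.950
```

An average example is therefore about 1.28 MB, and the download is about 95% of the prepared dataset size.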

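Class labels and mapping

The dataset contains audio samples from the following Arabic dialects, along with their corresponding class labels:

0: Algeria
1: Egypt
2: Iraq
3: Jordan
4: Morocco
5: Saudi Arabia
6: Sudan
7: Syria
8: Tunisia
9: Yemen

Applications

The Classification of Arabic Dialects Audio Dataset can be used for the following applications:

Arabic dialect classification
Arabic spoken language recognition
Speech analysis and understanding for Arabic dialects
Acoustic modeling for Arabic dialects
Cross-dialect speech processing and synthesis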
