intronhealth/afrispeech-dialog

Source: Hugging Face · Updated 2024-10-28 · Indexed 2025-04-12
Download link:
https://hf-mirror.com/datasets/intronhealth/afrispeech-dialog
Resource description:
---
license: cc-by-nc-sa-4.0
task_categories:
- automatic-speech-recognition
language:
- en
tags:
- medical
- africa
---

# AfriSpeech-Dialog v1: A Conversational Speech Dataset for African Accents

[![CC BY-NC-SA 4.0][cc-by-nc-sa-shield]][cc-by-nc-sa]

This work is licensed under a [Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License][cc-by-nc-sa].

[![CC BY-NC-SA 4.0][cc-by-nc-sa-image]][cc-by-nc-sa]

[cc-by-nc-sa]: http://creativecommons.org/licenses/by-nc-sa/4.0/
[cc-by-nc-sa-image]: https://licensebuttons.net/l/by-nc-sa/4.0/88x31.png
[cc-by-nc-sa-shield]: https://img.shields.io/badge/License-CC%20BY--NC--SA%204.0-lightgrey.svg

### Overview and Purpose

**AfriSpeech-Dialog** is a pan-African conversational speech dataset with 6 hours of recorded dialogue, designed to support automatic speech recognition (ASR) and speaker diarization applications. Collected from speakers with diverse accents across Nigeria, Kenya, and South Africa, the dataset captures the varied linguistic and phonetic characteristics of African-accented English. This release includes 50 conversations across both medical and general topics.

#### Dataset Statistics

|                          | Medical | General |
|--------------------------|---------|---------|
| **Counts**               | 20      | 29      |
| **Timestamped Counts**   | 9       | 21      |
| **Avg. Num. of Turns**   | 78.6    | 30.55   |
| **Total Duration (hrs)** | 2.07    | 4.93    |
| **Avg. Word Count**      | 725.3   | 1356.83 |
| **Num. of Countries**    | 1       | 3       |
| **Num. of Accents**      | 6       | 8       |
| **Genders (M, F)**       | (14,26) | (25,33) |

### Use Cases

This dataset is tailored for use in:

- Automatic Speech Recognition (ASR) fine-tuning
- Speaker Diarization training and testing

### Dataset Composition

- **Languages and Accents**: The dataset includes 11 accents: Hausa, Isoko, Idoma, Urhobo, Ijaw, Yoruba, Swahili, Sesotho, Igbo, Igala, and Ebira.
- **Domains**: Conversations span two domains: 20 medical conversations, simulating doctor-patient interactions, and 30 general-topic conversations.
- **Participants**: The dataset includes both male and female speakers.
- **Structure of Conversations**: Conversations are two-speaker free-form dialogues.

### Data Collection and Processing

- **Collection Method**: Conversations were collected remotely across various acoustic environments and stored as `.wav` files.
- **Annotation**: Each conversation is annotated with speaker labels and timestamps, including start and end times for each speaker’s turn.

### Key Columns and Fields

- **file_name**: Path to the audio file.
- **transcript**: Full transcript of the conversation with timestamps.
- **domain**: Indicates the conversation type, either medical or general.
- **duration**: Duration of the audio file, in seconds.
- **age_group**: Age group of the speakers.
- **accent**: Primary accent represented in the conversation.
- **country**: Country of origin for the speakers.

### Usage Instructions

**Accessing the Dataset**: The dataset can be accessed through Hugging Face:

```python
from datasets import load_dataset

afrispeech_dialog = load_dataset("intronhealth/afrispeech-dialog")
```
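
Once loaded, the `domain` and `duration` columns described above can be used to tally conversations and audio hours per domain. The following is a minimal sketch rather than part of the official card: the helper name `summarize_by_domain` is hypothetical, and it assumes each row behaves like a mapping with a string `domain` and a numeric `duration` in seconds.

```python
from collections import defaultdict
from typing import Dict, Iterable, Mapping


def summarize_by_domain(rows: Iterable[Mapping]) -> Dict[str, Dict[str, float]]:
    """Count conversations and sum audio hours for each `domain` value.

    Hypothetical helper: assumes every row has a `domain` key and a
    numeric `duration` key expressed in seconds.
    """
    summary = defaultdict(lambda: {"conversations": 0, "hours": 0.0})
    for row in rows:
        entry = summary[row["domain"]]
        entry["conversations"] += 1
        # Convert the per-file duration from seconds to hours.
        entry["hours"] += float(row["duration"]) / 3600.0
    # Return plain dicts so the result is easy to print or serialize.
    return {domain: dict(stats) for domain, stats in summary.items()}
```

With the dataset loaded as shown earlier, a call such as `summarize_by_domain(afrispeech_dialog["train"])` should yield per-domain totals, assuming the split is named `train` (the card does not state the split name). Iterating over full rows may also decode the audio, so selecting only the needed columns first, for example with `select_columns(["domain", "duration"])`, is likely to be faster.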
Provider:
intronhealth