BadiniSpeechNLP/fleurs-badini
Source: Hugging Face. Updated: 2026-04-26. Indexed: 2026-05-03.
Download link:
https://hf-mirror.com/datasets/BadiniSpeechNLP/fleurs-badini
Resource overview:
---
language:
- ku
- en
license: cc-by-nc-nd-4.0
pretty_name: FLEURS-Badini
size_categories:
- 1K<n<10K
configs:
- config_name: default
---
# FLEURS-Badini
## Dataset Summary
FLEURS-Badini is a speech dataset for the **Badini dialect of Northern Kurdish**, designed for research in:
- Automatic Speech Recognition (ASR)
- Speech-to-Text Translation (S2TT)

It is a dialect-specific extension of the FLEURS benchmark, providing aligned **speech–text–translation** data for a low-resource language variant.

The dataset contains **5,224 utterances (approximately 15 h 40 min of audio)** recorded from **45 speakers**.
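
As a rough sanity check on the figures above, the averages they imply can be worked out directly. This is a minimal sketch that uses only the three numbers quoted in this card; the total duration is approximate, and the card does not say how utterances are distributed across speakers, so the results are plain means rather than reported statistics.

```python
# Figures quoted in the dataset summary; the duration is approximate (~15h40m).
NUM_UTTERANCES = 5224
NUM_SPEAKERS = 45
TOTAL_SECONDS = 15 * 3600 + 40 * 60  # 15 h 40 min expressed in seconds

# Simple means implied by those figures (not per-speaker statistics).
avg_utterance_seconds = TOTAL_SECONDS / NUM_UTTERANCES
avg_utterances_per_speaker = NUM_UTTERANCES / NUM_SPEAKERS
avg_minutes_per_speaker = TOTAL_SECONDS / 60 / NUM_SPEAKERS

print(f"{avg_utterance_seconds:.1f} s per utterance")              # 10.8 s per utterance
print(f"{avg_utterances_per_speaker:.0f} utterances per speaker")  # 116 utterances per speaker
print(f"{avg_minutes_per_speaker:.1f} min per speaker")            # 20.9 min per speaker
```

An average of roughly 11 seconds per utterance is consistent with sentence-length read speech.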
---
## Supported Tasks
- Automatic Speech Recognition (ASR)
- Speech-to-Text Translation (S2TT)
---
## Languages
- Kurdish (Badini dialect, Arabic-based script)
- English
---
## Dataset Structure
### Data Instances
Each example contains:
```json
{
  "audio": "path/to/audio.wav",
  "segment_name": "xxx.wav",
  "speaker": "spk_xxxxx",
  "kurdish": "...",
  "english": "..."
}
```
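
The record layout above can be checked and split into task-specific pairs with a few lines of Python. This is a minimal sketch, not part of the dataset's tooling: the field names come from the example in this card, while the helper names `missing_fields` and `to_task_pairs` and the placeholder record are hypothetical. It also assumes that the `kurdish` field is the ASR target and the `english` field is the S2TT target, which the card implies but does not state outright.

```python
from typing import Any

# Field names taken from the data-instance example in this card.
EXPECTED_FIELDS = ("audio", "segment_name", "speaker", "kurdish", "english")


def missing_fields(record: dict[str, Any]) -> list[str]:
    """Return the expected field names that are absent from `record`."""
    return [name for name in EXPECTED_FIELDS if name not in record]


def to_task_pairs(record: dict[str, Any]) -> dict[str, tuple[Any, str]]:
    """Build (audio, target text) pairs for the two supported tasks.

    Assumption: `kurdish` is the transcript (ASR target) and `english`
    is the translation (S2TT target).
    """
    return {
        "asr": (record["audio"], record["kurdish"]),
        "s2tt": (record["audio"], record["english"]),
    }


# Hypothetical record with placeholder values, mirroring the JSON example.
example = {
    "audio": "path/to/audio.wav",
    "segment_name": "xxx.wav",
    "speaker": "spk_xxxxx",
    "kurdish": "...",
    "english": "...",
}

if not missing_fields(example):
    pairs = to_task_pairs(example)
    print(pairs["asr"])
    print(pairs["s2tt"])
```

Keeping the schema check separate from the pairing step makes it easy to filter out incomplete records before training either an ASR or an S2TT model.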
Provider:
BadiniSpeechNLP



