
BadiniSpeechNLP/fleurs-badini

Hugging Face | Updated 2026-04-26 | Indexed 2026-05-03
Download link:
https://hf-mirror.com/datasets/BadiniSpeechNLP/fleurs-badini
Description:
---
language:
- ku
- en
license: cc-by-nc-nd-4.0
pretty_name: FLEURS-Badini
size_categories:
- 1K<n<10K
configs:
- config_name: default
---

# FLEURS-Badini

## Dataset Summary

FLEURS-Badini is a speech dataset for the **Badini dialect of Northern Kurdish**, designed for research in:

- Automatic Speech Recognition (ASR)
- Speech-to-Text Translation (S2TT)

It is a dialect-specific extension of the FLEURS benchmark, providing aligned **speech–text–translation** data for a low-resource language variant.

The dataset contains **5,224 utterances (~15h40m)** recorded from **45 speakers**.

---

## Supported Tasks

- Automatic Speech Recognition (ASR)
- Speech-to-Text Translation (S2TT)

---

## Languages

- Kurdish (Badini dialect, Arabic-based script)
- English

---

## Dataset Structure

### Data Instances

Each example contains:

```json
{
  "audio": "path/to/audio.wav",
  "segment_name": "xxx.wav",
  "speaker": "spk_xxxxx",
  "kurdish": "...",
  "english": "..."
}
```
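The record layout under Data Instances can serve both supported tasks: the `kurdish` field is the ASR target and the `english` field is the S2TT target for the same audio. The following is a minimal sketch of that idea using only the Python standard library. The helper names (`validate_example`, `to_asr_pair`, `to_s2tt_pair`) and the placeholder text values are hypothetical and not part of the dataset; the field names and the corpus totals are taken from the card, and the total duration is approximate ("~15h40m").

```python
from typing import Dict, List, Tuple

# Field names as listed in the card's "Data Instances" example.
REQUIRED_FIELDS = ("audio", "segment_name", "speaker", "kurdish", "english")


def validate_example(record: Dict[str, str]) -> Dict[str, str]:
    """Return the record unchanged, or raise if a documented field is missing."""
    missing: List[str] = [f for f in REQUIRED_FIELDS if f not in record]
    if missing:
        raise ValueError(f"record is missing fields: {missing}")
    return record


def to_asr_pair(record: Dict[str, str]) -> Tuple[str, str]:
    """ASR view: audio paired with the Badini transcript."""
    return record["audio"], record["kurdish"]


def to_s2tt_pair(record: Dict[str, str]) -> Tuple[str, str]:
    """S2TT view: the same audio paired with the English translation."""
    return record["audio"], record["english"]


# Placeholder record mirroring the card's example; the text values are
# illustrative stand-ins, not real dataset content.
example = validate_example(
    {
        "audio": "path/to/audio.wav",
        "segment_name": "xxx.wav",
        "speaker": "spk_xxxxx",
        "kurdish": "<Badini transcript>",
        "english": "<English translation>",
    }
)

asr_pair = to_asr_pair(example)
s2tt_pair = to_s2tt_pair(example)

# Rough corpus statistics derived from the card's summary figures.
TOTAL_SECONDS = 15 * 3600 + 40 * 60  # ~15h40m, approximate per the card
NUM_UTTERANCES = 5224
NUM_SPEAKERS = 45

AVG_UTTERANCE_SECONDS = TOTAL_SECONDS / NUM_UTTERANCES
AVG_UTTERANCES_PER_SPEAKER = NUM_UTTERANCES / NUM_SPEAKERS
```

Under these figures the average utterance is roughly 10.8 seconds long, with about 116 utterances per speaker on average. These are means only; the card does not state how utterances are distributed across speakers.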
Provider:
BadiniSpeechNLP