
LocalDoc/fleurs-azerbaijani-asr

Hugging Face · Updated 2026-03-24 · Indexed 2026-03-29
Download link:
https://hf-mirror.com/datasets/LocalDoc/fleurs-azerbaijani-asr
Description:
---
language:
- az
license: cc-by-4.0
task_categories:
- automatic-speech-recognition
tags:
- azerbaijani
- asr
- speech
- fleurs
- benchmark
pretty_name: FLEURS Azerbaijani ASR Benchmark
---

# FLEURS Azerbaijani ASR Benchmark

Azerbaijani (az_az) subset of [FLEURS](https://huggingface.co/datasets/google/fleurs), reformatted for ASR benchmarking and fine-tuning.

## Source

Based on the **FLEURS** dataset by Google ([Conneau et al., 2022](https://arxiv.org/abs/2205.12446)). Licensed under **CC-BY-4.0**.

## Structure

| Split | Samples | Duration |
|-------|---------|----------|
| train | 2656 | 9.28h |
| dev | 400 | 1.35h |
| test | 921 | 3.23h |

## Fields

- `audio`: 16 kHz mono WAV
- `sentence`: transcription (original casing and punctuation)
- `sentence_normalized`: normalized transcription (lowercase, no punctuation)
- `gender`: male / female
- `duration_seconds`: audio duration in seconds

## Usage

```python
from datasets import load_dataset

ds = load_dataset("LocalDoc/fleurs-azerbaijani-asr")

# Benchmark
test = ds["test"]

# Training
train = ds["train"]
```

## Citation

```bibtex
@article{fleurs2022arxiv,
  title={FLEURS: Few-shot Learning Evaluation of Universal Representations of Speech},
  author={Conneau, Alexis and Ma, Min and Khanuja, Simran and Zhang, Yu and Axelrod, Vera and Dalmia, Siddharth and Riesa, Jason and Rivera, Clara and Bapna, Ankur},
  journal={arXiv preprint arXiv:2205.12446},
  year={2022},
}
```
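
## Scoring sketch

Since the card positions the `test` split as a benchmark, a word error rate (WER) computed against `sentence_normalized` is one plausible way to score a model. The following is a minimal sketch under that assumption, not part of the dataset or any official evaluation script. The helper names `edit_distance` and `corpus_wer` are hypothetical, and the hypotheses are assumed to be already lowercased and stripped of punctuation so that they are comparable with `sentence_normalized`.

```python
from typing import List, Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Word-level Levenshtein distance (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def corpus_wer(references: List[str], hypotheses: List[str]) -> float:
    """Corpus-level WER: total word errors divided by total reference words."""
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have the same length")
    errors = 0
    ref_words = 0
    for ref, hyp in zip(references, hypotheses):
        ref_tokens = ref.split()
        errors += edit_distance(ref_tokens, hyp.split())
        ref_words += len(ref_tokens)
    if ref_words == 0:
        raise ValueError("references contain no words")
    return errors / ref_words
```

In use, `references` would be the `sentence_normalized` column of the `test` split and `hypotheses` the model outputs in the same order. Pooling errors over the whole corpus, rather than averaging per-utterance rates, keeps short utterances from dominating the score.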
Provider:
LocalDoc