
benkum/tts-general-benchmark

Source: Hugging Face | Updated: 2026-04-20 | Indexed: 2026-04-26
Download link:
https://hf-mirror.com/datasets/benkum/tts-general-benchmark
Resource description:
---
pretty_name: TTS General Benchmark
language:
- en
- hi
- bn
- ta
- te
- kn
- ml
- mr
- gu
- od
- pa
license: other
task_categories:
- text-to-speech
size_categories:
- 1K<n<10K
tags:
- tts
- benchmark
- indian-languages
- telephony
- evaluation
- multilingual
---

# TTS General Benchmark

A multilingual Text-to-Speech (TTS) evaluation benchmark covering 11 Indian languages across multiple real-world use cases. The dataset is designed for systematic and repeatable evaluation of TTS systems under both high-quality and telephony bandwidth conditions.

- **Total prompts:** 1,815 unique text samples
- **Languages:** 11
- **Evaluation tracks:** High Quality + 8 kHz Telephony

> This is an **evaluation-only benchmark dataset** intended for testing and comparison, not for model training.

---

# Dataset Overview

The TTS General Benchmark provides diverse prompts that reflect practical deployment scenarios such as conversational agents, announcements, narration, support calls, and telephony bots. Prompts are curated to test clarity, robustness, pronunciation handling, and expressive capability.

Each prompt is labeled with:

- language
- usecase
- eval_category (evaluation track)

The dataset contains **two independently evaluated tracks** with different prompt distributions.

---

# Evaluation Categories

## high_quality

Full-band prompts intended for studio / wideband TTS evaluation. These focus on naturalness, expressiveness, and content realism.

### High Quality Use Cases

| Use Case | Samples |
|----------|----------|
| Conversational Bots | 275 |
| Audiobook | 132 |
| Information Narration / News | 121 |
| General Conversations | 110 |
| Education | 110 |
| AI Assistants | 110 |
| Content Creation | 110 |
| Culture | 77 |
| Announcements | 110 |
| Indianisms | 55 |
| Insane Repetition | 55 |

**High-quality total:** 1,265

---

## 8khz_telephony

Narrowband prompts designed for telephony and call-center evaluation (8 kHz playback target). These measure intelligibility, clarity, and robustness under bandwidth constraints.

### Telephony Use Cases

| Use Case | Samples |
|----------|----------|
| collections | 110 |
| edge_cases | 110 |
| sales_bot | 110 |
| support | 110 |
| survey_bot | 110 |

**Telephony total:** 550

---

# Supported Languages

| Language | Code |
|-----------|--------|
| English | en |
| Hindi | hi |
| Bengali | bn |
| Tamil | ta |
| Telugu | te |
| Kannada | kn |
| Malayalam | ml |
| Marathi | mr |
| Gujarati | gu |
| Odia | od |
| Punjabi | pa |

Language coverage is shared across both evaluation tracks.

---

# Dataset Structure

Each JSONL row contains:

```json
{
  "text": "The text to be synthesized",
  "language": "hi",
  "usecase": "Conversational Bots",
  "eval_category": "high_quality"
}
```
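
The row schema above can be checked and tallied with a few lines of standard-library Python. The following is a minimal sketch, not part of the dataset card: the helper names `read_prompts` and `count_by_track_and_usecase` are hypothetical, the second sample row is a made-up placeholder, and the sketch assumes that `eval_category` takes exactly the two values used as section headings (`high_quality` and `8khz_telephony`).

```python
import json
from collections import Counter

# Field names and language codes as listed in the dataset card.
REQUIRED_FIELDS = ("text", "language", "usecase", "eval_category")
LANGUAGES = {"en", "hi", "bn", "ta", "te", "kn", "ml", "mr", "gu", "od", "pa"}
# Assumption: track values match the card's section headings.
TRACKS = {"high_quality", "8khz_telephony"}


def read_prompts(lines):
    """Parse an iterable of JSONL lines and validate each row against the schema."""
    rows = []
    for lineno, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        row = json.loads(line)
        missing = [field for field in REQUIRED_FIELDS if field not in row]
        if missing:
            raise ValueError(f"line {lineno}: missing fields {missing}")
        if row["language"] not in LANGUAGES:
            raise ValueError(f"line {lineno}: unknown language {row['language']!r}")
        if row["eval_category"] not in TRACKS:
            raise ValueError(f"line {lineno}: unknown track {row['eval_category']!r}")
        rows.append(row)
    return rows


def count_by_track_and_usecase(rows):
    """Count prompts per (eval_category, usecase) pair."""
    return Counter((row["eval_category"], row["usecase"]) for row in rows)


# Illustrative input: the first row is the card's example, the second is a placeholder.
SAMPLE_JSONL = """\
{"text": "The text to be synthesized", "language": "hi", "usecase": "Conversational Bots", "eval_category": "high_quality"}
{"text": "Placeholder telephony prompt", "language": "en", "usecase": "support", "eval_category": "8khz_telephony"}
"""

rows = read_prompts(SAMPLE_JSONL.splitlines())
counts = count_by_track_and_usecase(rows)
for (track, usecase), n in sorted(counts.items()):
    print(f"{track:16} {usecase:24} {n}")
```

Against the real data, passing an open file handle (opened with `encoding="utf-8"`) to `read_prompts` should work the same way, and the resulting counts can be compared with the per-use-case tables above (1,265 high-quality prompts and 550 telephony prompts).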
Provider:
benkum