benkum/tts-general-benchmark
Source: Hugging Face. Updated 2026-04-20; indexed 2026-04-26.
Download link:
https://hf-mirror.com/datasets/benkum/tts-general-benchmark
Description:
---
pretty_name: TTS General Benchmark
language:
- en
- hi
- bn
- ta
- te
- kn
- ml
- mr
- gu
- od
- pa
license: other
task_categories:
- text-to-speech
size_categories:
- 1K<n<10K
tags:
- tts
- benchmark
- indian-languages
- telephony
- evaluation
- multilingual
---
# TTS General Benchmark
A multilingual Text-to-Speech (TTS) evaluation benchmark covering 11 languages (English and 10 Indian languages) across multiple real-world use cases. The dataset is designed for systematic and repeatable evaluation of TTS systems under both high-quality and telephony bandwidth conditions.
**Total prompts:** 1,815 unique text samples
**Languages:** 11
**Evaluation tracks:** High Quality + 8 kHz Telephony
> This is an **evaluation-only benchmark dataset** intended for testing and comparison — not for model training.
---
# Dataset Overview
The TTS General Benchmark provides diverse prompts that reflect practical deployment scenarios such as conversational agents, announcements, narration, support calls, and telephony bots. Prompts are curated to test clarity, robustness, pronunciation handling, and expressive capability.
Each prompt is labeled with:
- language
- usecase
- eval_category (evaluation track)
The dataset contains **two independently evaluated tracks** with different prompt distributions.
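As a minimal sketch of how these labels might be used, the following groups rows by evaluation track and counts use cases within each track, so that the two tracks can be scored separately. The sample rows and the `group_by_track` helper are illustrative assumptions; they are not taken from the dataset or from any official tooling.

```python
import json
from collections import Counter, defaultdict

# Illustrative rows only: the texts are invented for this sketch,
# but the field names match the labels described above.
SAMPLE_JSONL = """\
{"text": "Hello, how can I help you today?", "language": "en", "usecase": "Conversational Bots", "eval_category": "high_quality"}
{"text": "Your payment is due on Friday.", "language": "en", "usecase": "collections", "eval_category": "8khz_telephony"}
{"text": "Please rate your experience from one to five.", "language": "en", "usecase": "survey_bot", "eval_category": "8khz_telephony"}
"""


def group_by_track(jsonl_text):
    """Group parsed JSONL rows by their eval_category label (hypothetical helper)."""
    tracks = defaultdict(list)
    for line in jsonl_text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        row = json.loads(line)
        tracks[row["eval_category"]].append(row)
    return dict(tracks)


tracks = group_by_track(SAMPLE_JSONL)

# Count use cases separately inside each track.
usecase_counts = {
    track: Counter(row["usecase"] for row in rows)
    for track, rows in tracks.items()
}
for track, counts in usecase_counts.items():
    print(track, dict(counts))
```

Keeping the tracks in separate buckets mirrors the card's statement that they are evaluated independently.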
---
# Evaluation Categories
## high_quality
Full-band prompts intended for studio / wideband TTS evaluation. These focus on naturalness, expressiveness, and content realism.
### High Quality Use Cases
| Use Case | Samples |
|----------|----------|
| Conversational Bots | 275 |
| Audiobook | 132 |
| Information Narration / News | 121 |
| General Conversations | 110 |
| Education | 110 |
| AI Assistants | 110 |
| Content Creation | 110 |
| Culture | 77 |
| Announcements | 110 |
| Indianisms | 55 |
| Insane Repetition | 55 |
**High-quality total:** 1,265
---
## 8khz_telephony
Narrowband prompts designed for telephony and call-center evaluation (8 kHz playback target). These measure intelligibility, clarity, and robustness under bandwidth constraints.
### Telephony Use Cases
| Use Case | Samples |
|----------|----------|
| collections | 110 |
| edge_cases | 110 |
| sales_bot | 110 |
| support | 110 |
| survey_bot | 110 |
**Telephony total:** 550
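The per-use-case counts in the two tables can be cross-checked against the stated totals. The sketch below only re-adds the numbers printed in this card; the dictionary names are illustrative.

```python
# Sample counts copied from the use-case tables in this card.
HIGH_QUALITY = {
    "Conversational Bots": 275,
    "Audiobook": 132,
    "Information Narration / News": 121,
    "General Conversations": 110,
    "Education": 110,
    "AI Assistants": 110,
    "Content Creation": 110,
    "Culture": 77,
    "Announcements": 110,
    "Indianisms": 55,
    "Insane Repetition": 55,
}
TELEPHONY = {
    "collections": 110,
    "edge_cases": 110,
    "sales_bot": 110,
    "support": 110,
    "survey_bot": 110,
}

high_quality_total = sum(HIGH_QUALITY.values())
telephony_total = sum(TELEPHONY.values())
grand_total = high_quality_total + telephony_total

print(high_quality_total, telephony_total, grand_total)  # 1265 550 1815
```

The sums agree with the card: 1,265 high-quality prompts plus 550 telephony prompts gives the stated 1,815 total.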
---
# Supported Languages
| Language | Code |
|-----------|--------|
| English | en |
| Hindi | hi |
| Bengali | bn |
| Tamil | ta |
| Telugu | te |
| Kannada | kn |
| Malayalam | ml |
| Marathi | mr |
| Gujarati | gu |
| Odia | od |
| Punjabi | pa |
Language coverage is shared across both evaluation tracks.
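One way to use the language table is as a sanity check on rows before running an evaluation. The following is a sketch under stated assumptions: `validate_row` is a hypothetical helper, and the required fields are taken from the example row in this card (`text`, `language`, `usecase`, `eval_category`).

```python
# Language codes exactly as listed in the table above.
SUPPORTED_LANGUAGES = {
    "en": "English", "hi": "Hindi", "bn": "Bengali", "ta": "Tamil",
    "te": "Telugu", "kn": "Kannada", "ml": "Malayalam", "mr": "Marathi",
    "gu": "Gujarati", "od": "Odia", "pa": "Punjabi",
}
# Track names as they appear in this card.
EVAL_CATEGORIES = {"high_quality", "8khz_telephony"}
# Assumed required fields, based on the card's example row.
REQUIRED_FIELDS = ("text", "language", "usecase", "eval_category")


def validate_row(row):
    """Return a list of problems found in one row (empty if it looks valid)."""
    problems = []
    for field in REQUIRED_FIELDS:
        value = row.get(field)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"missing or empty field: {field}")
    language = row.get("language")
    if isinstance(language, str) and language not in SUPPORTED_LANGUAGES:
        problems.append(f"unsupported language code: {language}")
    category = row.get("eval_category")
    if isinstance(category, str) and category not in EVAL_CATEGORIES:
        problems.append(f"unknown eval_category: {category}")
    return problems


good = {"text": "Sample prompt", "language": "hi",
        "usecase": "support", "eval_category": "8khz_telephony"}
bad = {"text": "Sample prompt", "language": "fr",
       "usecase": "support", "eval_category": "8khz_telephony"}
print(validate_row(good))
print(validate_row(bad))
```

Returning a list of problems rather than raising on the first one makes it easier to report every issue in a file in a single pass.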
---
# Dataset Structure
Each JSONL row contains:
```json
{
  "text": "The text to be synthesized",
  "language": "hi",
  "usecase": "Conversational Bots",
  "eval_category": "high_quality"
}
```
Provider:
benkum



