datadriven-company/TTS-German
Source: Hugging Face. Updated 2026-03-13; indexed 2026-03-29.
Download link:
https://hf-mirror.com/datasets/datadriven-company/TTS-German
Resource description:
---
language:
- de
license: cc-by-4.0
task_categories:
- text-to-speech
- automatic-speech-recognition
pretty_name: TTS-German
size_categories:
- 100K<n<1M
tags:
- audio
- speech
- tts
- audiobooks
- processed
---
# TTS-German
High-quality German speech dataset for TTS and ASR, derived from **[CML-TTS German](https://huggingface.co/datasets/cmu-lti/cml-tts)**.
## Processing Pipeline
1. Standardize → 24kHz mono WAV, loudness normalize
2. Transcribe → WhisperX word-level timestamps
3. Segment → ≤12s at word boundaries
4. Denoise → DeepFilterNet
5. Quality filter → DNSMOS ≥ 2.5
6. G2P → IPA phonemes (custom dictionary)
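Step 3 of the pipeline cuts audio at word boundaries so that no segment exceeds 12 s. The card does not publish the actual segmentation code, so the following is only a minimal sketch of one plausible approach: a greedy packer over word-level timestamps of the kind WhisperX produces. The `Word` class and `segment_words` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import List

MAX_SEGMENT_SECONDS = 12.0  # limit stated in step 3 of the pipeline


@dataclass
class Word:
    """Hypothetical container for one word-level timestamp."""
    text: str
    start: float  # seconds from the start of the recording
    end: float    # seconds from the start of the recording


def segment_words(words: List[Word],
                  max_seconds: float = MAX_SEGMENT_SECONDS) -> List[List[Word]]:
    """Greedily pack consecutive words into segments of at most max_seconds.

    A single word longer than max_seconds still becomes its own segment;
    a real pipeline would need to decide how to handle that case.
    """
    segments: List[List[Word]] = []
    current: List[Word] = []
    for word in words:
        # Close the current segment if adding this word would exceed the limit.
        if current and word.end - current[0].start > max_seconds:
            segments.append(current)
            current = []
        current.append(word)
    if current:
        segments.append(current)
    return segments


# Synthetic timestamps, not taken from the dataset.
words = [
    Word("Guten", 0.0, 0.4),
    Word("Morgen", 0.5, 1.0),
    Word("zusammen", 11.8, 12.6),
    Word("heute", 12.7, 13.1),
]
segments = segment_words(words)
segment_texts = [" ".join(w.text for w in seg) for seg in segments]
print(segment_texts)
```

Because cuts only happen between words, each segment's transcript is simply the joined text of its words, which keeps audio and text aligned.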
## Statistics
| Metric | Value |
|--------|-------|
| Samples | 670,509 |
| Hours | 1,250 |
| Sample rate | 24kHz mono |
| Max duration | 12s |
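As a quick sanity check on these figures, the average clip length follows from the sample count and total hours. The calculation below assumes the reported hours figure is exact rather than rounded, so treat the result as approximate.

```python
# Figures copied from the Statistics table.
samples = 670_509
total_hours = 1250

# Mean clip length in seconds; it should sit below the 12 s maximum.
mean_seconds = total_hours * 3600 / samples
print(f"{mean_seconds:.2f} s per sample on average")
```

This gives roughly 6.7 s per sample, consistent with the 12 s cap.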
## Schema
| Column | Type | Description |
|--------|------|-------------|
| `__key__` | string | Unique ID |
| `audio` | Audio (24kHz FLAC) | Lossless audio |
| `text` | string | Transcript |
| `ipa` | string | IPA phonemes |
| `language` | string | Language code |
| `speaker_id` | string | Speaker identifier |
| `gender` | string | `male` / `female` / `unknown` |
| `dnsmos` | float | Quality score (1–5) |
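The schema has no explicit duration column, but a clip's length can be derived from the `audio` field. The sketch below assumes each decoded `audio` value exposes `array` and `sampling_rate` as described in the Usage section; `clip_seconds` and `seconds_per_speaker` are hypothetical helpers, and the rows are synthetic stand-ins rather than real dataset entries.

```python
from collections import defaultdict
from typing import Dict, Iterable, Mapping


def clip_seconds(audio: Mapping) -> float:
    """Duration of one clip: number of samples divided by the sample rate."""
    return len(audio["array"]) / audio["sampling_rate"]


def seconds_per_speaker(rows: Iterable[Mapping]) -> Dict[str, float]:
    """Sum clip durations per `speaker_id`."""
    totals: Dict[str, float] = defaultdict(float)
    for row in rows:
        totals[row["speaker_id"]] += clip_seconds(row["audio"])
    return dict(totals)


# Synthetic rows that mimic the schema (silence at 24 kHz).
rows = [
    {"speaker_id": "spk_a", "audio": {"array": [0.0] * 48_000, "sampling_rate": 24_000}},
    {"speaker_id": "spk_b", "audio": {"array": [0.0] * 24_000, "sampling_rate": 24_000}},
    {"speaker_id": "spk_a", "audio": {"array": [0.0] * 96_000, "sampling_rate": 24_000}},
]
totals = seconds_per_speaker(rows)
print(totals)
```

Per-speaker totals like these are useful when balancing speakers for TTS training.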
## Usage
```python
from datasets import load_dataset
ds = load_dataset("datadriven-company/TTS-German", split="train")
sample = ds[0]
print(sample["text"]) # transcript
print(sample["ipa"]) # IPA phonemes
# sample["audio"] → {"array": np.ndarray, "sampling_rate": 24000}
```
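With 670,509 samples, downloading the full split may be more than a quick experiment needs. One option is to stream rows and keep only those that pass a stricter quality threshold using the `dnsmos` and `gender` columns. The sketch below is an assumption-laden example: `select_rows` is a hypothetical helper, the 3.0 threshold is arbitrary (the dataset itself is filtered at DNSMOS ≥ 2.5), and the demo rows are synthetic so the block runs without network access.

```python
from typing import Iterable, Iterator, Mapping, Optional


def select_rows(rows: Iterable[Mapping],
                min_dnsmos: float = 3.0,
                gender: Optional[str] = None) -> Iterator[Mapping]:
    """Yield rows whose `dnsmos` is at least min_dnsmos and, if given, match `gender`."""
    for row in rows:
        if row["dnsmos"] < min_dnsmos:
            continue
        if gender is not None and row["gender"] != gender:
            continue
        yield row


# Possible use against the real dataset (requires network access):
#   from datasets import load_dataset
#   stream = load_dataset("datadriven-company/TTS-German",
#                         split="train", streaming=True)
#   for row in select_rows(stream, min_dnsmos=3.5, gender="female"):
#       print(row["text"])

# Synthetic rows that mimic the schema, used here only for demonstration.
demo_rows = [
    {"__key__": "a", "gender": "female", "dnsmos": 3.4},
    {"__key__": "b", "gender": "male", "dnsmos": 3.9},
    {"__key__": "c", "gender": "female", "dnsmos": 2.7},
]
selected = [r["__key__"] for r in select_rows(demo_rows, min_dnsmos=3.0, gender="female")]
print(selected)
```

Because `select_rows` is a generator, it works the same way on an in-memory list and on a streamed dataset.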
## License
CC BY 4.0 (`cc-by-4.0`). This dataset is derived from CML-TTS German.
Provider:
datadriven-company



