davidguzmanr/CSS10-Multilingual-LJSpeech
Source: Hugging Face. Updated 2026-03-22; indexed 2026-03-29.
Download link:
https://hf-mirror.com/datasets/davidguzmanr/CSS10-Multilingual-LJSpeech
Resource description:
---
license: apache-2.0
configs:
- config_name: Chinese
  data_files:
  - split: train
    path: Chinese/train-*
  - split: test
    path: Chinese/test-*
- config_name: Dutch
  data_files:
  - split: train
    path: Dutch/train-*
  - split: test
    path: Dutch/test-*
- config_name: English
  data_files:
  - split: train
    path: English/train-*
  - split: test
    path: English/test-*
- config_name: Finnish
  data_files:
  - split: train
    path: Finnish/train-*
  - split: test
    path: Finnish/test-*
- config_name: French
  data_files:
  - split: train
    path: French/train-*
  - split: test
    path: French/test-*
- config_name: German
  data_files:
  - split: train
    path: German/train-*
  - split: test
    path: German/test-*
- config_name: Greek
  data_files:
  - split: train
    path: Greek/train-*
  - split: test
    path: Greek/test-*
- config_name: Hungarian
  data_files:
  - split: train
    path: Hungarian/train-*
  - split: test
    path: Hungarian/test-*
- config_name: Japanese
  data_files:
  - split: train
    path: Japanese/train-*
  - split: test
    path: Japanese/test-*
- config_name: Russian
  data_files:
  - split: train
    path: Russian/train-*
  - split: test
    path: Russian/test-*
- config_name: Spanish
  data_files:
  - split: train
    path: Spanish/train-*
  - split: test
    path: Spanish/test-*
dataset_info:
- config_name: Chinese
  features:
  - name: audio
    dtype:
      audio:
        sampling_rate: 22050
  - name: text
    dtype: string
  - name: normalized_text
    dtype: string
  - name: duration
    dtype: float64
  splits:
  - name: train
    num_bytes: 1944771156.2
    num_examples: 2822
  - name: test
    num_bytes: 105135924.0
    num_examples: 149
  download_size: 1513159976
  dataset_size: 2049907080.2
- config_name: Dutch
  features:
  - name: audio
    dtype:
      audio:
        sampling_rate: 22050
  - name: text
    dtype: string
  - name: normalized_text
    dtype: string
  - name: duration
    dtype: float64
  splits:
  - name: train
    num_bytes: 3976183593.76
    num_examples: 5833
  - name: test
    num_bytes: 219071344.0
    num_examples: 307
  download_size: 3426425250
  dataset_size: 4195254937.76
- config_name: English
  features:
  - name: audio
    dtype:
      audio:
        sampling_rate: 22050
  - name: text
    dtype: string
  - name: normalized_text
    dtype: string
  - name: duration
    dtype: float64
  splits:
  - name: train
    num_bytes: 3651818726.04
    num_examples: 12429
  - name: test
    num_bytes: 186727739.0
    num_examples: 655
  download_size: 3781122874
  dataset_size: 3838546465.04
- config_name: Finnish
  features:
  - name: audio
    dtype:
      audio:
        sampling_rate: 22050
  - name: text
    dtype: string
  - name: normalized_text
    dtype: string
  - name: duration
    dtype: float64
  splits:
  - name: train
    num_bytes: 3206955578.428
    num_examples: 4517
  - name: test
    num_bytes: 166603230.0
    num_examples: 238
  download_size: 2582928898
  dataset_size: 3373558808.428
- config_name: French
  features:
  - name: audio
    dtype:
      audio:
        sampling_rate: 22050
  - name: text
    dtype: string
  - name: normalized_text
    dtype: string
  - name: duration
    dtype: float64
  splits:
  - name: train
    num_bytes: 5850027657.96
    num_examples: 8216
  - name: test
    num_bytes: 302349858.0
    num_examples: 433
  download_size: 4695881763
  dataset_size: 6152377515.96
- config_name: German
  features:
  - name: audio
    dtype:
      audio:
        sampling_rate: 22050
  - name: text
    dtype: string
  - name: normalized_text
    dtype: string
  - name: duration
    dtype: float64
  splits:
  - name: train
    num_bytes: 4759574507.8
    num_examples: 7055
  - name: test
    num_bytes: 254619376.0
    num_examples: 372
  download_size: 4019246531
  dataset_size: 5014193883.8
- config_name: Greek
  features:
  - name: audio
    dtype:
      audio:
        sampling_rate: 22050
  - name: text
    dtype: string
  - name: normalized_text
    dtype: string
  - name: duration
    dtype: float64
  splits:
  - name: train
    num_bytes: 1259066946.24
    num_examples: 1751
  - name: test
    num_bytes: 62855516.0
    num_examples: 93
  download_size: 962249671
  dataset_size: 1321922462.24
- config_name: Hungarian
  features:
  - name: audio
    dtype:
      audio:
        sampling_rate: 22050
  - name: text
    dtype: string
  - name: normalized_text
    dtype: string
  - name: duration
    dtype: float64
  splits:
  - name: train
    num_bytes: 3022096878.68
    num_examples: 4288
  - name: test
    num_bytes: 154210274.0
    num_examples: 226
  download_size: 2456347689
  dataset_size: 3176307152.68
- config_name: Japanese
  features:
  - name: audio
    dtype:
      audio:
        sampling_rate: 22050
  - name: text
    dtype: string
  - name: normalized_text
    dtype: string
  - name: duration
    dtype: float64
  splits:
  - name: train
    num_bytes: 4472783739.8
    num_examples: 6496
  - name: test
    num_bytes: 234866167.0
    num_examples: 342
  download_size: 3498101711
  dataset_size: 4707649906.8
- config_name: Russian
  features:
  - name: audio
    dtype:
      audio:
        sampling_rate: 22050
  - name: text
    dtype: string
  - name: normalized_text
    dtype: string
  - name: duration
    dtype: float64
  splits:
  - name: train
    num_bytes: 6292411768.12
    num_examples: 9118
  - name: test
    num_bytes: 338147442.0
    num_examples: 480
  download_size: 5028653756
  dataset_size: 6630559210.12
- config_name: Spanish
  features:
  - name: audio
    dtype:
      audio:
        sampling_rate: 22050
  - name: text
    dtype: string
  - name: normalized_text
    dtype: string
  - name: duration
    dtype: float64
  splits:
  - name: train
    num_bytes: 7116702663.6
    num_examples: 10465
  - name: test
    num_bytes: 382346422.0
    num_examples: 551
  download_size: 5770590547
  dataset_size: 7499049085.6
---
# CSS10-Multilingual-LJSpeech
A multilingual speech dataset that combines LJSpeech (English) with CSS10 (10 languages) in a consistent LJSpeech format.
## Dataset Description
This dataset merges:
- **LJSpeech**: High-quality English speech dataset
- **CSS10**: A collection of single-speaker speech datasets for 10 languages
All audio files are provided in a consistent format suitable for TTS training.
### Features
Each sample contains:
- **audio**: Waveform audio sampled at 22,050 Hz
- **text**: Original transcription
- **normalized_text**: Normalized version of the transcription
- **duration**: Audio duration in seconds
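As a rough illustration of how these fields might be accessed, the sketch below loads one language configuration with the Hugging Face `datasets` library and summarizes a row. The helper names `load_language` and `summarize`, and the choice of streaming mode, are assumptions made for this example and are not part of the dataset itself. The repository ID and configuration names are taken from the card metadata.

```python
REPO_ID = "davidguzmanr/CSS10-Multilingual-LJSpeech"

# Configuration names as listed in the card metadata.
CONFIGS = [
    "Chinese", "Dutch", "English", "Finnish", "French", "German",
    "Greek", "Hungarian", "Japanese", "Russian", "Spanish",
]


def load_language(config_name, split="train", streaming=True):
    """Hypothetical helper: load one language configuration.

    Streaming is an assumption chosen here to avoid downloading a whole
    multi-gigabyte configuration up front.
    """
    if config_name not in CONFIGS:
        raise ValueError(f"unknown configuration: {config_name!r}")
    # Imported lazily so the rest of the sketch runs without `datasets`.
    from datasets import load_dataset
    return load_dataset(REPO_ID, config_name, split=split, streaming=streaming)


def summarize(sample):
    """Hypothetical helper: one-line summary built from the documented fields."""
    return f"{sample['duration']:.2f}s | {sample['normalized_text']}"


# Example usage (requires network access and the `datasets` package):
#   first = next(iter(load_language("German", split="test")))
#   print(summarize(first))
```

The `audio` field is left untouched here because its decoded form depends on the installed `datasets` version and audio backend.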
A train-test split has been added for each language, where 5% of the data is reserved for testing. This split enables computation of objective metrics for TTS model evaluation.
## Citation
If you use this dataset, please cite both original sources:
```bibtex
@misc{ljspeech17,
  author       = {Keith Ito and Linda Johnson},
  title        = {The LJ Speech Dataset},
  howpublished = {\url{https://keithito.com/LJ-Speech-Dataset/}},
  year         = {2017}
}

@misc{park2019css10collectionsinglespeaker,
  title         = {CSS10: A Collection of Single Speaker Speech Datasets for 10 Languages},
  author        = {Kyubyong Park and Thomas Mulc},
  year          = {2019},
  eprint        = {1903.11269},
  archivePrefix = {arXiv},
  primaryClass  = {cs.CL},
  url           = {https://arxiv.org/abs/1903.11269},
}
```
Provider:
davidguzmanr



