
davidguzmanr/CSS10-Multilingual-LJSpeech

Hugging Face | Updated 2026-03-22 | Indexed 2026-03-29
Download link:
https://hf-mirror.com/datasets/davidguzmanr/CSS10-Multilingual-LJSpeech
Resource description:
---
license: apache-2.0
configs:
- config_name: Chinese
  data_files:
  - split: train
    path: Chinese/train-*
  - split: test
    path: Chinese/test-*
- config_name: Dutch
  data_files:
  - split: train
    path: Dutch/train-*
  - split: test
    path: Dutch/test-*
- config_name: English
  data_files:
  - split: train
    path: English/train-*
  - split: test
    path: English/test-*
- config_name: Finnish
  data_files:
  - split: train
    path: Finnish/train-*
  - split: test
    path: Finnish/test-*
- config_name: French
  data_files:
  - split: train
    path: French/train-*
  - split: test
    path: French/test-*
- config_name: German
  data_files:
  - split: train
    path: German/train-*
  - split: test
    path: German/test-*
- config_name: Greek
  data_files:
  - split: train
    path: Greek/train-*
  - split: test
    path: Greek/test-*
- config_name: Hungarian
  data_files:
  - split: train
    path: Hungarian/train-*
  - split: test
    path: Hungarian/test-*
- config_name: Japanese
  data_files:
  - split: train
    path: Japanese/train-*
  - split: test
    path: Japanese/test-*
- config_name: Russian
  data_files:
  - split: train
    path: Russian/train-*
  - split: test
    path: Russian/test-*
- config_name: Spanish
  data_files:
  - split: train
    path: Spanish/train-*
  - split: test
    path: Spanish/test-*
dataset_info:
- config_name: Chinese
  features:
  - name: audio
    dtype:
      audio:
        sampling_rate: 22050
  - name: text
    dtype: string
  - name: normalized_text
    dtype: string
  - name: duration
    dtype: float64
  splits:
  - name: train
    num_bytes: 1944771156.2
    num_examples: 2822
  - name: test
    num_bytes: 105135924.0
    num_examples: 149
  download_size: 1513159976
  dataset_size: 2049907080.2
- config_name: Dutch
  features:
  - name: audio
    dtype:
      audio:
        sampling_rate: 22050
  - name: text
    dtype: string
  - name: normalized_text
    dtype: string
  - name: duration
    dtype: float64
  splits:
  - name: train
    num_bytes: 3976183593.76
    num_examples: 5833
  - name: test
    num_bytes: 219071344.0
    num_examples: 307
  download_size: 3426425250
  dataset_size: 4195254937.76
- config_name: English
  features:
  - name: audio
    dtype:
      audio:
        sampling_rate: 22050
  - name: text
    dtype: string
  - name: normalized_text
    dtype: string
  - name: duration
    dtype: float64
  splits:
  - name: train
    num_bytes: 3651818726.04
    num_examples: 12429
  - name: test
    num_bytes: 186727739.0
    num_examples: 655
  download_size: 3781122874
  dataset_size: 3838546465.04
- config_name: Finnish
  features:
  - name: audio
    dtype:
      audio:
        sampling_rate: 22050
  - name: text
    dtype: string
  - name: normalized_text
    dtype: string
  - name: duration
    dtype: float64
  splits:
  - name: train
    num_bytes: 3206955578.428
    num_examples: 4517
  - name: test
    num_bytes: 166603230.0
    num_examples: 238
  download_size: 2582928898
  dataset_size: 3373558808.428
- config_name: French
  features:
  - name: audio
    dtype:
      audio:
        sampling_rate: 22050
  - name: text
    dtype: string
  - name: normalized_text
    dtype: string
  - name: duration
    dtype: float64
  splits:
  - name: train
    num_bytes: 5850027657.96
    num_examples: 8216
  - name: test
    num_bytes: 302349858.0
    num_examples: 433
  download_size: 4695881763
  dataset_size: 6152377515.96
- config_name: German
  features:
  - name: audio
    dtype:
      audio:
        sampling_rate: 22050
  - name: text
    dtype: string
  - name: normalized_text
    dtype: string
  - name: duration
    dtype: float64
  splits:
  - name: train
    num_bytes: 4759574507.8
    num_examples: 7055
  - name: test
    num_bytes: 254619376.0
    num_examples: 372
  download_size: 4019246531
  dataset_size: 5014193883.8
- config_name: Greek
  features:
  - name: audio
    dtype:
      audio:
        sampling_rate: 22050
  - name: text
    dtype: string
  - name: normalized_text
    dtype: string
  - name: duration
    dtype: float64
  splits:
  - name: train
    num_bytes: 1259066946.24
    num_examples: 1751
  - name: test
    num_bytes: 62855516.0
    num_examples: 93
  download_size: 962249671
  dataset_size: 1321922462.24
- config_name: Hungarian
  features:
  - name: audio
    dtype:
      audio:
        sampling_rate: 22050
  - name: text
    dtype: string
  - name: normalized_text
    dtype: string
  - name: duration
    dtype: float64
  splits:
  - name: train
    num_bytes: 3022096878.68
    num_examples: 4288
  - name: test
    num_bytes: 154210274.0
    num_examples: 226
  download_size: 2456347689
  dataset_size: 3176307152.68
- config_name: Japanese
  features:
  - name: audio
    dtype:
      audio:
        sampling_rate: 22050
  - name: text
    dtype: string
  - name: normalized_text
    dtype: string
  - name: duration
    dtype: float64
  splits:
  - name: train
    num_bytes: 4472783739.8
    num_examples: 6496
  - name: test
    num_bytes: 234866167.0
    num_examples: 342
  download_size: 3498101711
  dataset_size: 4707649906.8
- config_name: Russian
  features:
  - name: audio
    dtype:
      audio:
        sampling_rate: 22050
  - name: text
    dtype: string
  - name: normalized_text
    dtype: string
  - name: duration
    dtype: float64
  splits:
  - name: train
    num_bytes: 6292411768.12
    num_examples: 9118
  - name: test
    num_bytes: 338147442.0
    num_examples: 480
  download_size: 5028653756
  dataset_size: 6630559210.12
- config_name: Spanish
  features:
  - name: audio
    dtype:
      audio:
        sampling_rate: 22050
  - name: text
    dtype: string
  - name: normalized_text
    dtype: string
  - name: duration
    dtype: float64
  splits:
  - name: train
    num_bytes: 7116702663.6
    num_examples: 10465
  - name: test
    num_bytes: 382346422.0
    num_examples: 551
  download_size: 5770590547
  dataset_size: 7499049085.6
---

# CSS10-Multilingual-LJSpeech

Multilingual speech dataset combining LJSpeech (English) + CSS10 (10 languages) in a consistent LJSpeech format.

## Dataset Description

This dataset merges:

- **LJSpeech**: High-quality English speech dataset
- **CSS10**: A collection of single-speaker speech datasets for 10 languages

All audio files are provided in a consistent format suitable for TTS training.

### Features

Each sample contains:

- **audio**: Waveform audio sampled at 22,050 Hz
- **text**: Original transcription
- **normalized_text**: Normalized version of the transcription
- **duration**: Audio duration in seconds

A train-test split has been added for each language, where 5% of the data is reserved for testing. This split enables computation of objective metrics for TTS model evaluation.

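
As a quick sanity check on the 5% figure, the sketch below recomputes the held-out share per language from the `num_examples` values in the metadata. It also includes a small loader, which is an illustration rather than part of the original card: `held_out_fraction` and `load_split` are hypothetical helper names, and the loader assumes the Hugging Face `datasets` library (with audio decoding support) is installed and the Hub is reachable.

```python
# (train, test) example counts copied from the dataset_info metadata.
SPLIT_COUNTS = {
    "Chinese": (2822, 149),
    "Dutch": (5833, 307),
    "English": (12429, 655),
    "Finnish": (4517, 238),
    "French": (8216, 433),
    "German": (7055, 372),
    "Greek": (1751, 93),
    "Hungarian": (4288, 226),
    "Japanese": (6496, 342),
    "Russian": (9118, 480),
    "Spanish": (10465, 551),
}

REPO_ID = "davidguzmanr/CSS10-Multilingual-LJSpeech"


def held_out_fraction(config: str) -> float:
    """Return the share of examples that sit in the test split of one config."""
    train, test = SPLIT_COUNTS[config]
    return test / (train + test)


def load_split(config: str, split: str = "test"):
    """Load one language config from the Hub (hypothetical helper).

    Assumes `pip install datasets` and network access; downloads can be
    several gigabytes per language.
    """
    if config not in SPLIT_COUNTS:
        raise ValueError(f"unknown config {config!r}; expected one of {sorted(SPLIT_COUNTS)}")
    # Imported lazily so the split arithmetic above works without the library.
    from datasets import load_dataset

    return load_dataset(REPO_ID, config, split=split)


if __name__ == "__main__":
    for name in SPLIT_COUNTS:
        print(f"{name:<10} {held_out_fraction(name):.2%}")
```

With these counts, every config lands between 5.00% and 5.05% held out, which is consistent with the stated 5% split. Calling `load_split("German")` would then fetch only the German test split; each returned row should expose the `audio`, `text`, `normalized_text`, and `duration` fields listed above.
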
## Citation

If you use this dataset, please cite both original sources:

```bibtex
@misc{ljspeech17,
  author       = {Keith Ito and Linda Johnson},
  title        = {The LJ Speech Dataset},
  howpublished = {\url{https://keithito.com/LJ-Speech-Dataset/}},
  year         = {2017}
}

@misc{park2019css10collectionsinglespeaker,
  title={CSS10: A Collection of Single Speaker Speech Datasets for 10 Languages},
  author={Kyubyong Park and Thomas Mulc},
  year={2019},
  eprint={1903.11269},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/1903.11269},
}
```

Provider:
davidguzmanr