
mythicinfinity/libritts_r

Source: Hugging Face. Updated 2024-02-09; indexed 2025-04-26.
Download link:
https://hf-mirror.com/datasets/mythicinfinity/libritts_r
Resource description:
---
license: cc-by-4.0
task_categories:
- text-to-speech
language:
- en
size_categories:
- 10K<n<100K
configs:
- config_name: dev
  data_files:
  - split: dev.clean
    path: "data/dev.clean/dev.clean*.parquet"
- config_name: clean
  data_files:
  - split: dev.clean
    path: "data/dev.clean/dev.clean*.parquet"
  - split: test.clean
    path: "data/test.clean/test.clean*.parquet"
  - split: train.clean.100
    path: "data/train.clean.100/train.clean.100*.parquet"
  - split: train.clean.360
    path: "data/train.clean.360/train.clean.360*.parquet"
- config_name: other
  data_files:
  - split: dev.other
    path: "data/dev.other/dev.other*.parquet"
  - split: test.other
    path: "data/test.other/test.other*.parquet"
  - split: train.other.500
    path: "data/train.other.500/train.other.500*.parquet"
- config_name: all
  data_files:
  - split: dev.clean
    path: "data/dev.clean/dev.clean*.parquet"
  - split: dev.other
    path: "data/dev.other/dev.other*.parquet"
  - split: test.clean
    path: "data/test.clean/test.clean*.parquet"
  - split: test.other
    path: "data/test.other/test.other*.parquet"
  - split: train.clean.100
    path: "data/train.clean.100/train.clean.100*.parquet"
  - split: train.clean.360
    path: "data/train.clean.360/train.clean.360*.parquet"
  - split: train.other.500
    path: "data/train.other.500/train.other.500*.parquet"
---

# Dataset Card for LibriTTS-R

LibriTTS-R [1] is a sound quality improved version of the LibriTTS corpus (http://www.openslr.org/60/) which is a multi-speaker English corpus of approximately 585 hours of read English speech at 24kHz sampling rate, published in 2019.

## Overview

This is the LibriTTS-R dataset, adapted for the `datasets` library.

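
The `configs` entries in the front matter map each configuration name to a set of splits, and each split to a parquet glob. As a rough sketch, that mapping can be written out in plain Python to check which splits a configuration would pull in. The names `CONFIG_SPLITS`, `splits_for`, and `parquet_glob` are hypothetical helpers introduced here for illustration; they are not part of the dataset or of the `datasets` library.

```python
# Config -> splits, transcribed from the `configs` entries in the front matter.
# All names in this block are illustrative helpers, not a library API.
CONFIG_SPLITS = {
    "dev": ["dev.clean"],
    "clean": ["dev.clean", "test.clean", "train.clean.100", "train.clean.360"],
    "other": ["dev.other", "test.other", "train.other.500"],
    "all": [
        "dev.clean",
        "dev.other",
        "test.clean",
        "test.other",
        "train.clean.100",
        "train.clean.360",
        "train.other.500",
    ],
}


def splits_for(config: str = "all") -> list:
    """Return the split names a configuration exposes."""
    if config not in CONFIG_SPLITS:
        raise ValueError(f"unknown config {config!r}; expected one of {sorted(CONFIG_SPLITS)}")
    return list(CONFIG_SPLITS[config])


def parquet_glob(split: str) -> str:
    """Rebuild the glob pattern the front matter uses for a split."""
    return f"data/{split}/{split}*.parquet"


if __name__ == "__main__":
    for name in CONFIG_SPLITS:
        print(name, "->", splits_for(name))
```

A split name returned by `splits_for` is the value that would be passed as the `split` argument of `load_dataset()`.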
## Usage

### Splits

There are 7 splits (dots replace dashes from the original dataset, to comply with Hugging Face naming requirements):

- dev.clean
- dev.other
- test.clean
- test.other
- train.clean.100
- train.clean.360
- train.other.500

### Configurations

There are 4 configurations, each of which limits the splits that the `load_dataset()` function will download.

The default configuration is "all".

- "dev": only the "dev.clean" split (good for testing the dataset quickly)
- "clean": contains only "clean" splits
- "other": contains only "other" splits
- "all": contains all splits

### Example

Loading the `clean` config with only the `train.clean.100` split.

```
from datasets import load_dataset
load_dataset("blabble-io/libritts_r", "clean", split="train.clean.100")
```

Streaming is also supported.

```
from datasets import load_dataset
load_dataset("blabble-io/libritts_r", streaming=True)
```

### Columns

```
{
    "audio": datasets.Audio(sampling_rate=24_000),
    "text_normalized": datasets.Value("string"),
    "text_original": datasets.Value("string"),
    "speaker_id": datasets.Value("string"),
    "path": datasets.Value("string"),
    "chapter_id": datasets.Value("string"),
    "id": datasets.Value("string"),
}
```

### Example Row

```
{
    'audio': {
        'path': '/home/user/.cache/huggingface/datasets/downloads/extracted/5551a515e85b9e463062524539c2e1cb52ba32affe128dffd866db0205248bdd/LibriTTS_R/dev-clean/3081/166546/3081_166546_000028_000002.wav',
        'array': ...,
        'sampling_rate': 24000
    },
    'text_normalized': 'How quickly he disappeared!"',
    'text_original': 'How quickly he disappeared!"',
    'speaker_id': '3081',
    'path': '/home/user/.cache/huggingface/datasets/downloads/extracted/5551a515e85b9e463062524539c2e1cb52ba32affe128dffd866db0205248bdd/LibriTTS_R/dev-clean/3081/166546/3081_166546_000028_000002.wav',
    'chapter_id': '166546',
    'id': '3081_166546_000028_000002'
}
```

## Dataset Details

### Dataset Description

- **License:** CC BY 4.0

### Dataset Sources

<!-- Provide the basic links for the dataset.
-->

- **Homepage:** https://www.openslr.org/141/
- **Paper:** https://arxiv.org/abs/2305.18802

## Citation

```
@ARTICLE{Koizumi2023-hs,
  title         = "{LibriTTS-R}: A restored multi-speaker text-to-speech corpus",
  author        = "Koizumi, Yuma and Zen, Heiga and Karita, Shigeki and Ding, Yifan and Yatabe, Kohei and Morioka, Nobuyuki and Bacchiani, Michiel and Zhang, Yu and Han, Wei and Bapna, Ankur",
  abstract      = "This paper introduces a new speech dataset called ``LibriTTS-R'' designed for text-to-speech (TTS) use. It is derived by applying speech restoration to the LibriTTS corpus, which consists of 585 hours of speech data at 24 kHz sampling rate from 2,456 speakers and the corresponding texts. The constituent samples of LibriTTS-R are identical to those of LibriTTS, with only the sound quality improved. Experimental results show that the LibriTTS-R ground-truth samples showed significantly improved sound quality compared to those in LibriTTS. In addition, neural end-to-end TTS trained with LibriTTS-R achieved speech naturalness on par with that of the ground-truth samples. The corpus is freely available for download from \url{http://www.openslr.org/141/}.",
  month         = may,
  year          = 2023,
  copyright     = "http://creativecommons.org/licenses/by-nc-nd/4.0/",
  archivePrefix = "arXiv",
  primaryClass  = "eess.AS",
  eprint        = "2305.18802"
}
```
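
Returning to the example row in the Usage section: its `id` value (`3081_166546_000028_000002`) begins with the row's `speaker_id` and `chapter_id`, separated by underscores. The sketch below assumes this pattern holds for every row, which the card does not state; it is inferred from that single example. The meaning of the two trailing numeric fields is not documented here, so they are kept as an opaque remainder. `UtteranceId`, `parse_utterance_id`, and `is_consistent` are hypothetical names.

```python
from typing import NamedTuple


class UtteranceId(NamedTuple):
    speaker_id: str
    chapter_id: str
    remainder: str  # trailing fields; their meaning is not stated in the card


def parse_utterance_id(utt_id: str) -> UtteranceId:
    """Split an id such as '3081_166546_000028_000002' into its leading fields.

    Assumes the layout seen in the card's single example row.
    """
    parts = utt_id.split("_", 2)
    if len(parts) != 3:
        raise ValueError(f"unexpected id layout: {utt_id!r}")
    return UtteranceId(*parts)


def is_consistent(row: dict) -> bool:
    """Check that a row's id agrees with its speaker_id and chapter_id columns."""
    parsed = parse_utterance_id(row["id"])
    return parsed.speaker_id == row["speaker_id"] and parsed.chapter_id == row["chapter_id"]


if __name__ == "__main__":
    # Values copied from the example row in the card.
    example = {"id": "3081_166546_000028_000002", "speaker_id": "3081", "chapter_id": "166546"}
    print(parse_utterance_id(example["id"]), is_consistent(example))
```

A check like `is_consistent` could serve as a cheap sanity test when iterating over a streamed split, before relying on the `id` layout for grouping by speaker or chapter.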
Provider:
mythicinfinity