mythicinfinity/libritts_r

Name: mythicinfinity/libritts_r
Creator: mythicinfinity
Published: 2024-02-09 21:20:19
License: CC BY 4.0

Source: Hugging Face. Updated 2024-02-09. Indexed 2025-04-26.

Download link:

https://hf-mirror.com/datasets/mythicinfinity/libritts_r

Description:

---
license: cc-by-4.0
task_categories:
- text-to-speech
language:
- en
size_categories:
- 10K<n<100K
configs:
- config_name: dev
  data_files:
  - split: dev.clean
    path: "data/dev.clean/dev.clean*.parquet"
- config_name: clean
  data_files:
  - split: dev.clean
    path: "data/dev.clean/dev.clean*.parquet"
  - split: test.clean
    path: "data/test.clean/test.clean*.parquet"
  - split: train.clean.100
    path: "data/train.clean.100/train.clean.100*.parquet"
  - split: train.clean.360
    path: "data/train.clean.360/train.clean.360*.parquet"
- config_name: other
  data_files:
  - split: dev.other
    path: "data/dev.other/dev.other*.parquet"
  - split: test.other
    path: "data/test.other/test.other*.parquet"
  - split: train.other.500
    path: "data/train.other.500/train.other.500*.parquet"
- config_name: all
  data_files:
  - split: dev.clean
    path: "data/dev.clean/dev.clean*.parquet"
  - split: dev.other
    path: "data/dev.other/dev.other*.parquet"
  - split: test.clean
    path: "data/test.clean/test.clean*.parquet"
  - split: test.other
    path: "data/test.other/test.other*.parquet"
  - split: train.clean.100
    path: "data/train.clean.100/train.clean.100*.parquet"
  - split: train.clean.360
    path: "data/train.clean.360/train.clean.360*.parquet"
  - split: train.other.500
    path: "data/train.other.500/train.other.500*.parquet"
---

# Dataset Card for LibriTTS-R

LibriTTS-R [1] is a sound-quality-improved version of the LibriTTS corpus (http://www.openslr.org/60/), a multi-speaker English corpus of approximately 585 hours of read English speech at a 24 kHz sampling rate, published in 2019.

## Overview

This is the LibriTTS-R dataset, adapted for the `datasets` library.

## Usage

### Splits

There are 7 splits (dots replace the dashes used in the original dataset, to comply with Hugging Face naming requirements):

- dev.clean
- dev.other
- test.clean
- test.other
- train.clean.100
- train.clean.360
- train.other.500

### Configurations

There are 4 configurations, each of which limits the splits that the `load_dataset()` function will download.

The default configuration is "all".

- "dev": only the "dev.clean" split (good for testing the dataset quickly)
- "clean": contains only the "clean" splits
- "other": contains only the "other" splits
- "all": contains all splits

### Example

Loading the `clean` config with only the `train.clean.100` split.

```
from datasets import load_dataset
ds = load_dataset("mythicinfinity/libritts_r", "clean", split="train.clean.100")
```

Streaming is also supported.

```
from datasets import load_dataset
ds = load_dataset("mythicinfinity/libritts_r", streaming=True)
```

### Columns

```
{
    "audio": datasets.Audio(sampling_rate=24_000),
    "text_normalized": datasets.Value("string"),
    "text_original": datasets.Value("string"),
    "speaker_id": datasets.Value("string"),
    "path": datasets.Value("string"),
    "chapter_id": datasets.Value("string"),
    "id": datasets.Value("string"),
}
```

### Example Row

```
{
    'audio': {
        'path': '/home/user/.cache/huggingface/datasets/downloads/extracted/5551a515e85b9e463062524539c2e1cb52ba32affe128dffd866db0205248bdd/LibriTTS_R/dev-clean/3081/166546/3081_166546_000028_000002.wav',
        'array': ...,
        'sampling_rate': 24000
    },
    'text_normalized': 'How quickly he disappeared!"',
    'text_original': 'How quickly he disappeared!"',
    'speaker_id': '3081',
    'path': '/home/user/.cache/huggingface/datasets/downloads/extracted/5551a515e85b9e463062524539c2e1cb52ba32affe128dffd866db0205248bdd/LibriTTS_R/dev-clean/3081/166546/3081_166546_000028_000002.wav',
    'chapter_id': '166546',
    'id': '3081_166546_000028_000002'
}
```

## Dataset Details

### Dataset Description

- **License:** CC BY 4.0

### Dataset Sources

- **Homepage:** https://www.openslr.org/141/
- **Paper:** https://arxiv.org/abs/2305.18802

## Citation

```
@ARTICLE{Koizumi2023-hs,
  title         = "{LibriTTS-R}: A restored multi-speaker text-to-speech corpus",
  author        = "Koizumi, Yuma and Zen, Heiga and Karita, Shigeki and Ding, Yifan and Yatabe, Kohei and Morioka, Nobuyuki and Bacchiani, Michiel and Zhang, Yu and Han, Wei and Bapna, Ankur",
  abstract      = "This paper introduces a new speech dataset called ``LibriTTS-R'' designed for text-to-speech (TTS) use. It is derived by applying speech restoration to the LibriTTS corpus, which consists of 585 hours of speech data at 24 kHz sampling rate from 2,456 speakers and the corresponding texts. The constituent samples of LibriTTS-R are identical to those of LibriTTS, with only the sound quality improved. Experimental results show that the LibriTTS-R ground-truth samples showed significantly improved sound quality compared to those in LibriTTS. In addition, neural end-to-end TTS trained with LibriTTS-R achieved speech naturalness on par with that of the ground-truth samples. The corpus is freely available for download from \url{http://www.openslr.org/141/}.",
  month         = may,
  year          = 2023,
  copyright     = "http://creativecommons.org/licenses/by-nc-nd/4.0/",
  archivePrefix = "arXiv",
  primaryClass  = "eess.AS",
  eprint        = "2305.18802"
}
```
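
As a recap of the split, configuration, and row-identifier conventions described in the card, here is a minimal offline sketch. The table is transcribed from the card's YAML front matter. The helper names (`to_hf_split`, `configs_containing`, `parse_utterance_id`) are hypothetical and not part of the `datasets` library, and the `id` layout (speaker, then chapter, then utterance fields) is an assumption inferred from the single example row.

```python
# Config -> splits table, transcribed from the card's YAML front matter.
CONFIG_SPLITS = {
    "dev": ["dev.clean"],
    "clean": ["dev.clean", "test.clean", "train.clean.100", "train.clean.360"],
    "other": ["dev.other", "test.other", "train.other.500"],
    "all": [
        "dev.clean", "dev.other", "test.clean", "test.other",
        "train.clean.100", "train.clean.360", "train.other.500",
    ],
}


def to_hf_split(original_name: str) -> str:
    """Hypothetical helper: map an original subset name such as 'dev-clean'
    to the dotted split name used by this dataset ('dev.clean')."""
    return original_name.replace("-", ".")


def configs_containing(split: str) -> list[str]:
    """Hypothetical helper: list the configurations that include a split."""
    return [name for name, splits in CONFIG_SPLITS.items() if split in splits]


def parse_utterance_id(utt_id: str) -> dict[str, str]:
    """Split an `id` value into its parts.

    Assumption: the first two underscore-separated fields are the speaker
    and chapter IDs, as in the example row; the rest identifies the utterance.
    """
    speaker_id, chapter_id, *rest = utt_id.split("_")
    return {
        "speaker_id": speaker_id,
        "chapter_id": chapter_id,
        "utterance": "_".join(rest),
    }
```

For instance, `configs_containing(to_hf_split("train-clean-360"))` names the configurations that can be passed as the second argument to `load_dataset()` when only that split is needed; choosing the smallest one avoids resolving splits that will not be used.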

Provider:

mythicinfinity
