libritts_r

Name: libritts_r
Creator: maas
Published: 2026-05-19 15:17:55
License: not specified

ModelScope community; updated 2026-05-19, indexed 2025-04-19

Download link:

https://modelscope.cn/datasets/pengzhendong/libritts_r


Description:

# Dataset Card for LibriTTS-R

LibriTTS-R [1] is a sound-quality-improved version of the LibriTTS corpus (http://www.openslr.org/60/), a multi-speaker English corpus of approximately 585 hours of read English speech at a 24 kHz sampling rate, published in 2019.

## Overview

This is the LibriTTS-R dataset, adapted for the `datasets` library.

## Usage

### Splits

There are 7 splits (dots replace the dashes used in the original dataset, to comply with Hugging Face naming requirements):

- dev.clean
- dev.other
- test.clean
- test.other
- train.clean.100
- train.clean.360
- train.other.500

### Configurations

There are 4 configurations, each of which limits the splits that the `load_dataset()` function will download. The default configuration is "all".

- "dev": only the "dev.clean" split (good for testing the dataset quickly)
- "clean": only the "clean" splits
- "other": only the "other" splits
- "all": all splits

### Example

Loading the `clean` config with only the `train.clean.100` split:

```
from datasets import load_dataset

load_dataset("blabble-io/libritts_r", "clean", split="train.clean.100")
```

Streaming is also supported:

```
from datasets import load_dataset

load_dataset("blabble-io/libritts_r", streaming=True)
```

### Columns

```
{
    "audio": datasets.Audio(sampling_rate=24_000),
    "text_normalized": datasets.Value("string"),
    "text_original": datasets.Value("string"),
    "speaker_id": datasets.Value("string"),
    "path": datasets.Value("string"),
    "chapter_id": datasets.Value("string"),
    "id": datasets.Value("string"),
}
```

### Example Row

```
{
    'audio': {
        'path': '/home/user/.cache/huggingface/datasets/downloads/extracted/5551a515e85b9e463062524539c2e1cb52ba32affe128dffd866db0205248bdd/LibriTTS_R/dev-clean/3081/166546/3081_166546_000028_000002.wav',
        'array': ...,
        'sampling_rate': 24000
    },
    'text_normalized': 'How quickly he disappeared!"',
    'text_original': 'How quickly he disappeared!"',
    'speaker_id': '3081',
    'path': '/home/user/.cache/huggingface/datasets/downloads/extracted/5551a515e85b9e463062524539c2e1cb52ba32affe128dffd866db0205248bdd/LibriTTS_R/dev-clean/3081/166546/3081_166546_000028_000002.wav',
    'chapter_id': '166546',
    'id': '3081_166546_000028_000002'
}
```

## Dataset Details

### Dataset Description

- **License:** CC BY 4.0

### Dataset Sources

- **Homepage:** https://www.openslr.org/141/
- **Paper:** https://arxiv.org/abs/2305.18802

## Citation

```
@ARTICLE{Koizumi2023-hs,
  title         = "{LibriTTS-R}: A restored multi-speaker text-to-speech corpus",
  author        = "Koizumi, Yuma and Zen, Heiga and Karita, Shigeki and Ding, Yifan and Yatabe, Kohei and Morioka, Nobuyuki and Bacchiani, Michiel and Zhang, Yu and Han, Wei and Bapna, Ankur",
  abstract      = "This paper introduces a new speech dataset called ``LibriTTS-R'' designed for text-to-speech (TTS) use. It is derived by applying speech restoration to the LibriTTS corpus, which consists of 585 hours of speech data at 24 kHz sampling rate from 2,456 speakers and the corresponding texts. The constituent samples of LibriTTS-R are identical to those of LibriTTS, with only the sound quality improved. Experimental results show that the LibriTTS-R ground-truth samples showed significantly improved sound quality compared to those in LibriTTS. In addition, neural end-to-end TTS trained with LibriTTS-R achieved speech naturalness on par with that of the ground-truth samples. The corpus is freely available for download from \url{http://www.openslr.org/141/}.",
  month         = may,
  year          = 2023,
  copyright     = "http://creativecommons.org/licenses/by-nc-nd/4.0/",
  archivePrefix = "arXiv",
  primaryClass  = "eess.AS",
  eprint        = "2305.18802"
}
```
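
The relationship between configurations and splits described in the card can be sketched in plain Python. This is only an illustration, not part of the dataset's loading script: the helper names `splits_for_config` and `original_dir_name` are hypothetical, and the reading that "clean" and "other" select every split whose name contains that word is an assumption based on the card's wording. The dashed directory form (`dev-clean`) is taken from the example row's path.

```python
# Split names as listed in the card (dots replace the dashes of the original corpus).
SPLITS = [
    "dev.clean",
    "dev.other",
    "test.clean",
    "test.other",
    "train.clean.100",
    "train.clean.360",
    "train.other.500",
]


def splits_for_config(config: str = "all") -> list[str]:
    """Hypothetical helper: splits a configuration is described as covering."""
    if config == "dev":
        # The card says "dev" contains only the dev.clean split.
        return ["dev.clean"]
    if config == "all":
        return list(SPLITS)
    if config in ("clean", "other"):
        # Assumption: every split whose dotted name includes the keyword.
        return [s for s in SPLITS if config in s.split(".")]
    raise ValueError(f"unknown configuration: {config!r}")


def original_dir_name(split: str) -> str:
    """Map a dotted split name back to the dashed form seen in file paths."""
    return split.replace(".", "-")


if __name__ == "__main__":
    for name in ("dev", "clean", "other", "all"):
        print(name, splits_for_config(name))
```

Checking a requested split against `splits_for_config(config)` before calling `load_dataset()` is one way to catch a config/split mismatch early, under the assumptions stated above.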

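
In the card's example row, the `id` value begins with the row's `speaker_id` and `chapter_id`, joined by underscores. A minimal sketch of a consistency check built on that observation follows. The names `UtteranceId`, `parse_utterance_id`, and `row_is_consistent` are hypothetical, and the meaning of the remaining `id` fields is not stated in the card, so they are kept as opaque strings.

```python
from typing import NamedTuple


class UtteranceId(NamedTuple):
    speaker_id: str
    chapter_id: str
    rest: tuple[str, ...]  # remaining fields; their meaning is not given in the card


def parse_utterance_id(utt_id: str) -> UtteranceId:
    """Split an id such as '3081_166546_000028_000002' on underscores."""
    parts = utt_id.split("_")
    if len(parts) < 3:
        raise ValueError(f"unexpected id format: {utt_id!r}")
    return UtteranceId(parts[0], parts[1], tuple(parts[2:]))


def row_is_consistent(row: dict) -> bool:
    """Check that the id prefix agrees with the speaker_id and chapter_id columns."""
    parsed = parse_utterance_id(row["id"])
    return (
        parsed.speaker_id == row["speaker_id"]
        and parsed.chapter_id == row["chapter_id"]
    )


if __name__ == "__main__":
    # Values copied from the card's example row (audio omitted).
    example = {
        "speaker_id": "3081",
        "chapter_id": "166546",
        "id": "3081_166546_000028_000002",
    }
    print(parse_utterance_id(example["id"]))
    print(row_is_consistent(example))
```

Whether every row follows this pattern is an assumption drawn from a single example; the check simply reports rows that do not.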

Provider: maas

Created: 2025-04-16
