
ylacombe/libritts_r_tags_v2

Source: Hugging Face · Updated 2024-05-27 · Indexed 2024-06-12
Download link:
https://hf-mirror.com/datasets/ylacombe/libritts_r_tags_v2
Resource summary:
---
dataset_info:
- config_name: clean
  features:
  - name: text
    dtype: string
  - name: text_original
    dtype: string
  - name: speaker_id
    dtype: string
  - name: path
    dtype: string
  - name: chapter_id
    dtype: string
  - name: id
    dtype: string
  - name: snr
    dtype: float32
  - name: c50
    dtype: float32
  - name: speech_duration
    dtype: float64
  - name: speaking_rate
    dtype: float64
  - name: phonemes
    dtype: string
  - name: stoi
    dtype: float64
  - name: si-sdr
    dtype: float64
  - name: pesq
    dtype: float64
  - name: gender
    dtype: string
  - name: utterance_pitch_std
    dtype: float64
  - name: utterance_pitch_mean
    dtype: float64
  splits:
  - name: dev.clean
    num_bytes: 3545454
    num_examples: 5736
  - name: test.clean
    num_bytes: 3143569
    num_examples: 4837
  - name: train.clean.100
    num_bytes: 20737742
    num_examples: 33232
  - name: train.clean.360
    num_bytes: 73773720
    num_examples: 116426
  download_size: 46024942
  dataset_size: 101200485
- config_name: other
  features:
  - name: text
    dtype: string
  - name: text_original
    dtype: string
  - name: speaker_id
    dtype: string
  - name: path
    dtype: string
  - name: chapter_id
    dtype: string
  - name: id
    dtype: string
  - name: snr
    dtype: float32
  - name: c50
    dtype: float32
  - name: speech_duration
    dtype: float64
  - name: speaking_rate
    dtype: float64
  - name: phonemes
    dtype: string
  - name: stoi
    dtype: float64
  - name: si-sdr
    dtype: float64
  - name: pesq
    dtype: float64
  - name: gender
    dtype: string
  - name: utterance_pitch_std
    dtype: float64
  - name: utterance_pitch_mean
    dtype: float64
  splits:
  - name: dev.other
    num_bytes: 2708761
    num_examples: 4613
  - name: test.other
    num_bytes: 2923040
    num_examples: 5120
  - name: train.other.500
    num_bytes: 124369605
    num_examples: 205035
  download_size: 58188840
  dataset_size: 130001406
configs:
- config_name: clean
  data_files:
  - split: dev.clean
    path: clean/dev.clean-*
  - split: test.clean
    path: clean/test.clean-*
  - split: train.clean.100
    path: clean/train.clean.100-*
  - split: train.clean.360
    path: clean/train.clean.360-*
- config_name: other
  data_files:
  - split: dev.other
    path: other/dev.other-*
  - split: test.other
    path: other/test.other-*
  - split: train.other.500
    path: other/train.other.500-*
---

The dataset has two configurations, clean and other, which share the same 17 features: text, original text, speaker ID, path, chapter ID, ID, SNR, C50, speech duration, speaking rate, phonemes, STOI, SI-SDR, PESQ, gender, utterance pitch standard deviation, and utterance pitch mean. The clean configuration has one development split, one test split, and two training splits (160,231 examples in total; dataset size 101,200,485 bytes, download size 46,024,942 bytes). The other configuration has one development split, one test split, and one training split (214,768 examples in total; dataset size 130,001,406 bytes, download size 58,188,840 bytes).
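
As a minimal sketch of how one split might be loaded, assuming the Hugging Face `datasets` library is installed and the repository is reachable under the ID shown in the download link. The names `REPO_ID`, `SPLITS`, `split_names`, and `load_split` are illustrative helpers, not part of the dataset.

```python
# Config and split names copied from the dataset card's metadata.
SPLITS = {
    "clean": ["dev.clean", "test.clean", "train.clean.100", "train.clean.360"],
    "other": ["dev.other", "test.other", "train.other.500"],
}

REPO_ID = "ylacombe/libritts_r_tags_v2"


def split_names(config: str) -> list[str]:
    """Return the split names listed for a configuration (hypothetical helper)."""
    if config not in SPLITS:
        raise ValueError(f"unknown config {config!r}; expected one of {sorted(SPLITS)}")
    return list(SPLITS[config])


def load_split(config: str, split: str):
    """Load one split; needs the `datasets` package and network access."""
    if split not in split_names(config):
        raise ValueError(f"{split!r} is not a split of config {config!r}")
    # Imported lazily so the helpers above still work without `datasets` installed.
    from datasets import load_dataset
    return load_dataset(REPO_ID, config, split=split)
```

With network access, `load_split("clean", "dev.clean")` would be expected to return the development split of the clean configuration. The name check runs first, so a mismatched pair such as `("clean", "dev.other")` fails before any download starts.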
Provided by:
ylacombe
Summary of original information

Dataset overview

Configuration: clean

Features

  • text: string
  • text_original: string
  • speaker_id: string
  • path: string
  • chapter_id: string
  • id: string
  • snr: float (float32)
  • c50: float (float32)
  • speech_duration: float (float64)
  • speaking_rate: float (float64)
  • phonemes: string
  • stoi: float (float64)
  • si-sdr: float (float64)
  • pesq: float (float64)
  • gender: string
  • utterance_pitch_std: float (float64)
  • utterance_pitch_mean: float (float64)
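
Since every row carries numeric quality estimates next to the text, one plausible use is to select rows by those values. The sketch below works on plain dictionaries keyed by the feature names listed above. The names `FEATURE_TYPES`, `has_expected_fields`, and `keep_row`, as well as the threshold values, are illustrative assumptions and not recommendations from the dataset author.

```python
from typing import Any, Mapping

# Feature names from the list above; float32 and float64 both map to Python float.
FEATURE_TYPES = {
    "text": str,
    "text_original": str,
    "speaker_id": str,
    "path": str,
    "chapter_id": str,
    "id": str,
    "snr": float,
    "c50": float,
    "speech_duration": float,
    "speaking_rate": float,
    "phonemes": str,
    "stoi": float,
    "si-sdr": float,
    "pesq": float,
    "gender": str,
    "utterance_pitch_std": float,
    "utterance_pitch_mean": float,
}


def has_expected_fields(row: Mapping[str, Any]) -> bool:
    """Check that a row has every listed feature with the expected Python type."""
    return all(
        name in row and isinstance(row[name], typ)
        for name, typ in FEATURE_TYPES.items()
    )


def keep_row(row: Mapping[str, Any], min_snr: float = 20.0, min_pesq: float = 3.0) -> bool:
    """Keep rows whose SNR and PESQ estimates meet the (hypothetical) thresholds."""
    return row["snr"] >= min_snr and row["pesq"] >= min_pesq
```

If the data were loaded with the Hugging Face `datasets` library, a predicate of this shape could be passed to `Dataset.filter`, which calls it once per row.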

Data splits

  • dev.clean:
    • Bytes: 3545454
    • Examples: 5736
  • test.clean:
    • Bytes: 3143569
    • Examples: 4837
  • train.clean.100:
    • Bytes: 20737742
    • Examples: 33232
  • train.clean.360:
    • Bytes: 73773720
    • Examples: 116426

Data size

  • Download size: 46024942 bytes
  • Dataset size: 101200485 bytes

Configuration: other

Features

  • text: string
  • text_original: string
  • speaker_id: string
  • path: string
  • chapter_id: string
  • id: string
  • snr: float (float32)
  • c50: float (float32)
  • speech_duration: float (float64)
  • speaking_rate: float (float64)
  • phonemes: string
  • stoi: float (float64)
  • si-sdr: float (float64)
  • pesq: float (float64)
  • gender: string
  • utterance_pitch_std: float (float64)
  • utterance_pitch_mean: float (float64)

Data splits

  • dev.other:
    • Bytes: 2708761
    • Examples: 4613
  • test.other:
    • Bytes: 2923040
    • Examples: 5120
  • train.other.500:
    • Bytes: 124369605
    • Examples: 205035

Data size

  • Download size: 58188840 bytes
  • Dataset size: 130001406 bytes
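
The per-split figures can be cross-checked against the reported totals. The sketch below copies the numbers from this page and confirms that the split byte counts of each configuration add up to its dataset size. The names `CONFIGS` and `totals` are illustrative.

```python
# (num_bytes, num_examples) per split, copied from the figures on this page.
CONFIGS = {
    "clean": {
        "splits": {
            "dev.clean": (3545454, 5736),
            "test.clean": (3143569, 4837),
            "train.clean.100": (20737742, 33232),
            "train.clean.360": (73773720, 116426),
        },
        "download_size": 46024942,
        "dataset_size": 101200485,
    },
    "other": {
        "splits": {
            "dev.other": (2708761, 4613),
            "test.other": (2923040, 5120),
            "train.other.500": (124369605, 205035),
        },
        "download_size": 58188840,
        "dataset_size": 130001406,
    },
}


def totals(config: str) -> tuple[int, int]:
    """Return (total bytes, total examples) summed over a configuration's splits."""
    splits = list(CONFIGS[config]["splits"].values())
    return sum(b for b, _ in splits), sum(n for _, n in splits)


for name, info in CONFIGS.items():
    total_bytes, total_examples = totals(name)
    # The summed split sizes should equal the reported dataset size.
    assert total_bytes == info["dataset_size"]
    print(f"{name}: {total_examples} examples, {total_bytes} bytes")
# prints:
# clean: 160231 examples, 101200485 bytes
# other: 214768 examples, 130001406 bytes
```

In both configurations the download size is smaller than the dataset size, at roughly 45% of it.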