ylacombe/mls-eng-10k-descriptions-10k-v4

Name: ylacombe/mls-eng-10k-descriptions-10k-v4
Creator: ylacombe
Published: 2024-06-06 16:35:14
License: No description available

Source: Hugging Face. Updated 2024-06-06; indexed 2024-06-12.

Download link:

https://hf-mirror.com/datasets/ylacombe/mls-eng-10k-descriptions-10k-v4

Resource description:

---
dataset_info:
  features:
  - name: original_path
    dtype: string
  - name: begin_time
    dtype: float64
  - name: end_time
    dtype: float64
  - name: original_text
    dtype: string
  - name: audio_duration
    dtype: float64
  - name: speaker_id
    dtype: string
  - name: book_id
    dtype: string
  - name: snr
    dtype: float32
  - name: c50
    dtype: float32
  - name: speech_duration
    dtype: float64
  - name: speaking_rate
    dtype: string
  - name: phonemes
    dtype: string
  - name: stoi
    dtype: float64
  - name: si-sdr
    dtype: float64
  - name: pesq
    dtype: float64
  - name: gender
    dtype: string
  - name: utterance_pitch_mean
    dtype: float64
  - name: utterance_pitch_std
    dtype: float64
  - name: pitch
    dtype: string
  - name: noise
    dtype: string
  - name: reverberation
    dtype: string
  - name: speech_monotony
    dtype: string
  - name: sdr_noise
    dtype: string
  - name: pesq_speech_quality
    dtype: string
  - name: text_description
    dtype: string
  - name: text
    dtype: string
  splits:
  - name: dev
    num_bytes: 4793931
    num_examples: 3807
  - name: test
    num_bytes: 4741078
    num_examples: 3769
  - name: train
    num_bytes: 3020619415
    num_examples: 2420047
  download_size: 1571191914
  dataset_size: 3030154424
configs:
- config_name: default
  data_files:
  - split: dev
    path: data/dev-*
  - split: test
    path: data/test-*
  - split: train
    path: data/train-*
---

The dataset includes audio-related features: the original audio file path, begin and end times, original text, audio duration, speaker ID, book ID, signal-to-noise ratio (snr), c50, speech duration, speaking rate, phonemes, STOI, SI-SDR, PESQ, gender, utterance pitch mean and standard deviation, string-valued fields for pitch, noise, reverberation, speech monotony, SDR noise, and PESQ speech quality, plus a text description and text. The dataset is divided into dev, test, and train splits with 3,807, 3,769, and 2,420,047 examples respectively. The download size is 1,571,191,914 bytes (about 1.57 GB), and the dataset size is 3,030,154,424 bytes (about 3.03 GB).
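As a minimal sketch of how a split might be inspected, assuming the Hugging Face `datasets` library is installed and the repository is still publicly reachable (neither is stated on this page), the helper below streams a few rows instead of downloading the full archive. The function name `peek_split` is hypothetical.

```python
from itertools import islice

# Repository identifier as given in the page title.
REPO_ID = "ylacombe/mls-eng-10k-descriptions-10k-v4"


def peek_split(split="dev", n=3, repo_id=REPO_ID):
    """Return the first n rows of a split as dicts (hypothetical helper)."""
    # Imported lazily so this module can be imported without `datasets` installed.
    from datasets import load_dataset

    # streaming=True iterates over the remote shards rather than
    # downloading the whole dataset before the first row is available.
    stream = load_dataset(repo_id, split=split, streaming=True)
    return list(islice(stream, n))
```

Calling `peek_split("dev", 2)` requires network access; each returned row should be a dict keyed by the feature names listed on this page, such as `text` and `text_description`.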

Provider:

ylacombe

Summary of original information

Dataset overview

Features

The dataset contains the following features:

original_path: string
begin_time: float64
end_time: float64
original_text: string
audio_duration: float64
speaker_id: string
book_id: string
snr: float32
c50: float32
speech_duration: float64
speaking_rate: string
phonemes: string
stoi: float64
si-sdr: float64
pesq: float64
gender: string
utterance_pitch_mean: float64
utterance_pitch_std: float64
pitch: string
noise: string
reverberation: string
speech_monotony: string
sdr_noise: string
pesq_speech_quality: string
text_description: string
text: string
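The feature list can be written down as a plain mapping and used to check a record before further processing. This is a sketch using only the standard library; `FEATURE_TYPES` and `validate_row` are illustrative names, and the check only distinguishes strings from floats, not float32 from float64.

```python
# Column names grouped by type, following the feature list above.
STRING_COLUMNS = [
    "original_path", "original_text", "speaker_id", "book_id",
    "speaking_rate", "phonemes", "gender", "pitch", "noise",
    "reverberation", "speech_monotony", "sdr_noise",
    "pesq_speech_quality", "text_description", "text",
]
FLOAT_COLUMNS = [
    "begin_time", "end_time", "audio_duration", "snr", "c50",
    "speech_duration", "stoi", "si-sdr", "pesq",
    "utterance_pitch_mean", "utterance_pitch_std",
]

# Column name -> expected Python type.
FEATURE_TYPES = {
    **{name: str for name in STRING_COLUMNS},
    **{name: float for name in FLOAT_COLUMNS},
}


def validate_row(row):
    """Return a list of problems found in one record (empty if none)."""
    problems = []
    for name, expected in FEATURE_TYPES.items():
        if name not in row:
            problems.append(f"missing column: {name}")
        elif not isinstance(row[name], expected):
            actual = type(row[name]).__name__
            problems.append(f"{name}: expected {expected.__name__}, got {actual}")
    return problems
```

Note that `si-sdr` contains a hyphen, so it has to be accessed as `row["si-sdr"]` rather than as an attribute.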

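Data splits

The dataset is divided into the following splits:

dev: 4793931 bytes, 3807 examples
test: 4741078 bytes, 3769 examples
train: 3020619415 bytes, 2420047 examples

Dataset size

Download size: 1571191914 bytes
Dataset size: 3030154424 bytes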
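The figures above can be cross-checked with a little arithmetic: the per-split byte counts should add up to the reported dataset size, and the ratio of dataset size to download size gives a rough idea of how much the downloaded files expand. The sketch below uses only the numbers listed on this page; the variable names are illustrative.

```python
# Per-split figures as listed above.
SPLITS = {
    "dev": {"num_bytes": 4_793_931, "num_examples": 3_807},
    "test": {"num_bytes": 4_741_078, "num_examples": 3_769},
    "train": {"num_bytes": 3_020_619_415, "num_examples": 2_420_047},
}
DOWNLOAD_SIZE = 1_571_191_914  # bytes
DATASET_SIZE = 3_030_154_424   # bytes

total_bytes = sum(s["num_bytes"] for s in SPLITS.values())
total_examples = sum(s["num_examples"] for s in SPLITS.values())

# The split sizes account for the whole reported dataset size.
assert total_bytes == DATASET_SIZE

# Ratio of dataset size to download size.
expansion_ratio = DATASET_SIZE / DOWNLOAD_SIZE

# Average size of one example across all splits.
bytes_per_example = DATASET_SIZE / total_examples

print(f"examples: {total_examples}")
print(f"expansion ratio: {expansion_ratio:.2f}")
print(f"bytes per example: {bytes_per_example:.0f}")
```

The split sizes sum exactly to the reported dataset size, which suggests the listed numbers are internally consistent.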

Configuration

config_name: default
- data_files:
  - dev: data/dev-*
  - test: data/test-*
  - train: data/train-*
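The configuration maps each split to a glob pattern over files in the repository. As a rough illustration of how such patterns assign files to splits, the sketch below uses Python's `fnmatch` module; this approximates the matching and is not necessarily how the Hub resolves patterns. The shard file names in the usage example are hypothetical, since the page does not list the actual files.

```python
from fnmatch import fnmatch

# Split name -> glob pattern, as given in the configuration above.
DATA_FILES = {
    "dev": "data/dev-*",
    "test": "data/test-*",
    "train": "data/train-*",
}


def split_for(path):
    """Return the split whose pattern matches path, or None if none does."""
    for split, pattern in DATA_FILES.items():
        if fnmatch(path, pattern):
            return split
    return None


# Hypothetical file names, for illustration only.
for name in ["data/dev-00000-of-00001.parquet", "data/train-00003-of-00007.parquet", "README.md"]:
    print(name, "->", split_for(name))
```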
