ylacombe/mls-eng-10k-descriptions-10k-v4
Source: Hugging Face. Updated 2024-06-06; indexed 2024-06-12.
Download link:
https://hf-mirror.com/datasets/ylacombe/mls-eng-10k-descriptions-10k-v4
Resource description:
---
dataset_info:
  features:
  - name: original_path
    dtype: string
  - name: begin_time
    dtype: float64
  - name: end_time
    dtype: float64
  - name: original_text
    dtype: string
  - name: audio_duration
    dtype: float64
  - name: speaker_id
    dtype: string
  - name: book_id
    dtype: string
  - name: snr
    dtype: float32
  - name: c50
    dtype: float32
  - name: speech_duration
    dtype: float64
  - name: speaking_rate
    dtype: string
  - name: phonemes
    dtype: string
  - name: stoi
    dtype: float64
  - name: si-sdr
    dtype: float64
  - name: pesq
    dtype: float64
  - name: gender
    dtype: string
  - name: utterance_pitch_mean
    dtype: float64
  - name: utterance_pitch_std
    dtype: float64
  - name: pitch
    dtype: string
  - name: noise
    dtype: string
  - name: reverberation
    dtype: string
  - name: speech_monotony
    dtype: string
  - name: sdr_noise
    dtype: string
  - name: pesq_speech_quality
    dtype: string
  - name: text_description
    dtype: string
  - name: text
    dtype: string
  splits:
  - name: dev
    num_bytes: 4793931
    num_examples: 3807
  - name: test
    num_bytes: 4741078
    num_examples: 3769
  - name: train
    num_bytes: 3020619415
    num_examples: 2420047
  download_size: 1571191914
  dataset_size: 3030154424
configs:
- config_name: default
  data_files:
  - split: dev
    path: data/dev-*
  - split: test
    path: data/test-*
  - split: train
    path: data/train-*
---
The dataset provides audio-related features for each sample: the original audio path, begin and end times, original text, audio duration, speaker ID, book ID, signal-to-noise ratio (snr), c50, speech duration, speaking rate, phonemes, STOI, SI-SDR, PESQ, gender, utterance pitch mean and standard deviation, pitch, noise, reverberation, speech monotony, SDR noise, PESQ speech quality, text description, and text. It is divided into dev, test, and train splits with 3,807, 3,769, and 2,420,047 examples respectively. The download size is 1,571,191,914 bytes (about 1.57 GB), and the dataset size is 3,030,154,424 bytes (about 3.03 GB).
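Because the train split is large, it can help to inspect a few rows before downloading everything. The following is a minimal sketch, assuming the Hugging Face `datasets` library is installed and the repository is reachable; the helper names `stream_split` and `peek` are illustrative and not part of the dataset card.

```python
REPO_ID = "ylacombe/mls-eng-10k-descriptions-10k-v4"
SPLITS = ("dev", "test", "train")  # split names taken from the dataset card


def stream_split(split="dev"):
    """Return an iterable over one split without downloading it in full."""
    if split not in SPLITS:
        raise ValueError(f"unknown split {split!r}; expected one of {SPLITS}")
    # Imported lazily so this module can be imported without the library.
    from datasets import load_dataset

    return load_dataset(REPO_ID, split=split, streaming=True)


def peek(split="dev", n=3):
    """Collect the first n rows of a split as plain dictionaries."""
    rows = []
    for row in stream_split(split):
        rows.append(row)
        if len(rows) >= n:
            break
    return rows


# Example use (requires network access):
#     for row in peek("dev", 2):
#         print(row["speaker_id"], row["text_description"])
```

Streaming is chosen here so that a quick look at the dev split does not trigger the full download.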
Provider:
ylacombe
Summary of original information
Dataset overview
Features
The dataset contains the following features:
- original_path: string
- begin_time: float64
- end_time: float64
- original_text: string
- audio_duration: float64
- speaker_id: string
- book_id: string
- snr: float32
- c50: float32
- speech_duration: float64
- speaking_rate: string
- phonemes: string
- stoi: float64
- si-sdr: float64
- pesq: float64
- gender: string
- utterance_pitch_mean: float64
- utterance_pitch_std: float64
- pitch: string
- noise: string
- reverberation: string
- speech_monotony: string
- sdr_noise: string
- pesq_speech_quality: string
- text_description: string
- text: string
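The feature list can be turned into a small schema check. The sketch below is one possible way to do that in plain Python; the `check_row` helper is hypothetical, and treating `None` as acceptable is an assumption rather than something the dataset card states.

```python
# Column names and dtypes copied from the dataset card.
SCHEMA = {
    "original_path": "string",
    "begin_time": "float64",
    "end_time": "float64",
    "original_text": "string",
    "audio_duration": "float64",
    "speaker_id": "string",
    "book_id": "string",
    "snr": "float32",
    "c50": "float32",
    "speech_duration": "float64",
    "speaking_rate": "string",
    "phonemes": "string",
    "stoi": "float64",
    "si-sdr": "float64",
    "pesq": "float64",
    "gender": "string",
    "utterance_pitch_mean": "float64",
    "utterance_pitch_std": "float64",
    "pitch": "string",
    "noise": "string",
    "reverberation": "string",
    "speech_monotony": "string",
    "sdr_noise": "string",
    "pesq_speech_quality": "string",
    "text_description": "string",
    "text": "string",
}


def check_row(row):
    """Return a list of problems; an empty list means the row fits the schema."""
    problems = []
    for name, dtype in SCHEMA.items():
        if name not in row:
            problems.append(f"missing column: {name}")
            continue
        value = row[name]
        if value is None:
            continue  # assumption: null values are tolerated
        if dtype == "string":
            ok = isinstance(value, str)
        else:
            # Accept Python ints and floats for float32/float64, but not bools.
            ok = isinstance(value, (int, float)) and not isinstance(value, bool)
        if not ok:
            problems.append(f"{name}: expected {dtype}, got {type(value).__name__}")
    for name in sorted(set(row) - set(SCHEMA)):
        problems.append(f"unexpected column: {name}")
    return problems
```

Note that several columns that sound numeric, such as `speaking_rate`, `pitch`, and `noise`, are typed as strings, so a check like this treats them as text.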
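Data splits
The dataset is divided into the following splits:
- dev: 4,793,931 bytes, 3,807 examples
- test: 4,741,078 bytes, 3,769 examples
- train: 3,020,619,415 bytes, 2,420,047 examples
Dataset size
- Download size: 1,571,191,914 bytes
- Dataset size: 3,030,154,424 bytes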
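The figures above can be cross-checked with a little arithmetic. This sketch uses only the numbers from the dataset card; the variable names and the `to_gib` helper are illustrative.

```python
# Per-split figures copied from the dataset card.
SPLIT_INFO = {
    "dev": {"num_bytes": 4_793_931, "num_examples": 3_807},
    "test": {"num_bytes": 4_741_078, "num_examples": 3_769},
    "train": {"num_bytes": 3_020_619_415, "num_examples": 2_420_047},
}
DOWNLOAD_SIZE = 1_571_191_914  # bytes
DATASET_SIZE = 3_030_154_424   # bytes


def to_gib(num_bytes):
    """Convert a byte count to gibibytes (2**30 bytes)."""
    return num_bytes / 2**30


total_bytes = sum(info["num_bytes"] for info in SPLIT_INFO.values())
total_examples = sum(info["num_examples"] for info in SPLIT_INFO.values())
train_share = SPLIT_INFO["train"]["num_examples"] / total_examples
download_ratio = DOWNLOAD_SIZE / DATASET_SIZE

print(f"split bytes sum to dataset_size: {total_bytes == DATASET_SIZE}")
print(f"total examples: {total_examples:,}")
print(f"train share of examples: {train_share:.2%}")
print(f"download: {to_gib(DOWNLOAD_SIZE):.2f} GiB, dataset: {to_gib(DATASET_SIZE):.2f} GiB")
print(f"download / dataset size: {download_ratio:.3f}")
```

The per-split byte counts add up exactly to the reported dataset size, and the train split accounts for well over 99% of the examples.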
Configuration
- config_name: default
- data_files:
  - dev: data/dev-*
  - test: data/test-*
  - train: data/train-*
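The `data_files` patterns map shard files to splits by filename prefix. The sketch below approximates that mapping with Python's `fnmatch`; the shard names in the usage comments are hypothetical, and the Hub's own pattern resolution may differ in details.

```python
from fnmatch import fnmatch

# Glob patterns copied from the default config.
DATA_FILES = {
    "dev": "data/dev-*",
    "test": "data/test-*",
    "train": "data/train-*",
}


def split_for(path):
    """Return the split whose pattern matches the path, or None if none does."""
    for split, pattern in DATA_FILES.items():
        if fnmatch(path, pattern):
            return split
    return None


# Hypothetical shard names, for illustration only:
#     split_for("data/train-00000-of-00007.parquet")
#     split_for("README.md")
```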