JST-SUPERB/MUSAN-speech_unit_part1
Source: Hugging Face. Updated 2024-07-10; indexed 2024-07-22.
Download link:
https://hf-mirror.com/datasets/JST-SUPERB/MUSAN-speech_unit_part1
Description:
This dataset is organized into eight splits, each named after a codec configuration and sampling rate (for example dac_16k or audiodec_24k_320d). Its features include the speech input, Whisper transcriptions of noisy audio at signal-to-noise ratios of 10, 5, 0, -5, and -10 dB, Whisper transcriptions of the clean audio, and audio unit sequences for the clean and noisy audio. The download size is 9300844417 bytes (about 9.3 GB), and the total dataset size is 16579839734.800001 bytes (about 16.6 GB).
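As a minimal sketch of how one split might be loaded, the snippet below uses the Hugging Face `datasets` library with streaming enabled, so that shards are fetched lazily rather than downloaded in full. The repository id and split names are taken from this page; the helper name `load_split` is hypothetical, and access to the repository is assumed.

```python
from typing import Any

# Repository id and split names as listed on this page.
REPO_ID = "JST-SUPERB/MUSAN-speech_unit_part1"
SPLITS = (
    "academicodec_hifi_16k_320d",
    "academicodec_hifi_16k_320d_large_uni",
    "academicodec_hifi_24k_320d",
    "audiodec_24k_320d",
    "dac_16k",
    "dac_24k",
    "dac_44k",
    "speech_tokenizer_16k",
)


def load_split(split: str, streaming: bool = True) -> Any:
    """Load one split (hypothetical helper); streaming avoids a full download."""
    if split not in SPLITS:
        raise ValueError(f"unknown split {split!r}; expected one of {SPLITS}")
    # Imported lazily so that importing this module does not require `datasets`.
    from datasets import load_dataset

    return load_dataset(REPO_ID, split=split, streaming=streaming)


if __name__ == "__main__":
    # Requires network access and the `datasets` package.
    first_row = next(iter(load_split("dac_16k")))
    print(sorted(first_row.keys()))
```

Streaming is used here only because the full dataset is several gigabytes; passing `streaming=False` would download and cache the split instead.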
Provided by:
JST-SUPERB
Original metadata summary
Dataset overview
Dataset name
MUSAN-speech_unit_part1
Dataset configuration
- config_name: default
  data_files:
  - split: academicodec_hifi_16k_320d
    path: data/academicodec_hifi_16k_320d-*
  - split: academicodec_hifi_16k_320d_large_uni
    path: data/academicodec_hifi_16k_320d_large_uni-*
  - split: academicodec_hifi_24k_320d
    path: data/academicodec_hifi_24k_320d-*
  - split: audiodec_24k_320d
    path: data/audiodec_24k_320d-*
  - split: dac_16k
    path: data/dac_16k-*
  - split: dac_24k
    path: data/dac_24k-*
  - split: dac_44k
    path: data/dac_44k-*
  - split: speech_tokenizer_16k
    path: data/speech_tokenizer_16k-*
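The split names in the configuration above follow a visible pattern: a codec name, a token such as `16k`, and sometimes a trailing suffix. The sketch below splits a name into those parts and rebuilds the shard glob listed for it. Reading the `k` token as a nominal sampling-rate label is an assumption based on the description, the suffix (for example `320d`) is left uninterpreted, and `SplitInfo` and `describe_split` are hypothetical names.

```python
import re
from typing import NamedTuple, Optional


class SplitInfo(NamedTuple):
    codec: str              # e.g. "dac" or "academicodec_hifi"
    rate_tag: str           # e.g. "16k"; assumed to be a nominal sampling-rate label
    variant: Optional[str]  # trailing suffix such as "320d"; meaning not interpreted
    shard_glob: str         # glob pattern as listed in the configuration


# Lazy codec prefix, then the first "<digits>k" token, then an optional suffix.
_SPLIT_NAME = re.compile(r"^(?P<codec>.+?)_(?P<rate>\d+k)(?:_(?P<variant>.+))?$")


def describe_split(split: str) -> SplitInfo:
    """Break a split name into codec, rate tag, and optional variant suffix."""
    match = _SPLIT_NAME.match(split)
    if match is None:
        raise ValueError(f"unrecognized split name: {split!r}")
    return SplitInfo(
        codec=match["codec"],
        rate_tag=match["rate"],
        variant=match["variant"],
        shard_glob=f"data/{split}-*",
    )


if __name__ == "__main__":
    for name in ("dac_44k", "academicodec_hifi_16k_320d_large_uni"):
        print(describe_split(name))
```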
Dataset features
- name: speech_input
  dtype: string
- name: noisy_-20dB
  dtype: audio
- name: noisy_10dB_transcription_whisper-small.en
  dtype: string
- name: noisy_5dB_transcription_whisper-small.en
  dtype: string
- name: noisy_0dB_transcription_whisper-small.en
  dtype: string
- name: noisy_-5dB_transcription_whisper-small.en
  dtype: string
- name: noisy_-10dB_transcription_whisper-small.en
  dtype: string
- name: noisy_10dB_transcription_whisper-medium.en
  dtype: string
- name: noisy_5dB_transcription_whisper-medium.en
  dtype: string
- name: noisy_0dB_transcription_whisper-medium.en
  dtype: string
- name: noisy_-5dB_transcription_whisper-medium.en
  dtype: string
- name: noisy_-10dB_transcription_whisper-medium.en
  dtype: string
- name: noisy_10dB_transcription_whisper-large-v3
  dtype: string
- name: noisy_5dB_transcription_whisper-large-v3
  dtype: string
- name: noisy_0dB_transcription_whisper-large-v3
  dtype: string
- name: noisy_-5dB_transcription_whisper-large-v3
  dtype: string
- name: noisy_-10dB_transcription_whisper-large-v3
  dtype: string
- name: output
  dtype: string
- name: clean_audio_transcription_whisper-small.en
  dtype: string
- name: clean_audio_transcription_whisper-medium.en
  dtype: string
- name: clean_audio_transcription_whisper-large-v3
  dtype: string
- name: clean_audio_unit
  sequence:
    sequence: int64
- name: noisy_10dB_unit
  sequence:
    sequence: int64
- name: noisy_5dB_unit
  sequence:
    sequence: int64
- name: noisy_0dB_unit
  sequence:
    sequence: int64
- name: noisy_-5dB_unit
  sequence:
    sequence: int64
- name: noisy_-10dB_unit
  sequence:
    sequence: int64
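One possible use of the transcription columns listed above is to measure how much each noise level degrades recognition. The sketch below builds the column names from the naming pattern in the feature list and computes a word error rate with a standard word-level edit distance. Treating the clean-audio transcription as the reference is an assumption, not something this page specifies, and the helper names are hypothetical.

```python
from typing import Dict, Mapping

# SNR levels and ASR model names as they appear in the feature list.
SNR_LEVELS_DB = (10, 5, 0, -5, -10)
ASR_MODELS = ("whisper-small.en", "whisper-medium.en", "whisper-large-v3")


def noisy_column(snr_db: int, model: str) -> str:
    return f"noisy_{snr_db}dB_transcription_{model}"


def clean_column(model: str) -> str:
    return f"clean_audio_transcription_{model}"


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcription is empty")
    prev = list(range(len(hyp) + 1))  # distances against an empty reference prefix
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, substitution))
        prev = curr
    return prev[-1] / len(ref)


def wer_by_snr(row: Mapping[str, str], model: str) -> Dict[int, float]:
    """WER of each noisy transcription against the clean one (assumed reference)."""
    reference = row[clean_column(model)]
    return {
        snr: word_error_rate(reference, row[noisy_column(snr, model)])
        for snr in SNR_LEVELS_DB
    }
```

Given a row from any split, `wer_by_snr(row, "whisper-large-v3")` would return one value per SNR level. Text normalization here is limited to lowercasing and whitespace splitting; punctuation handling is left out of this sketch.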
Dataset splits
- name: academicodec_hifi_16k_320d
  num_bytes: 1360089979.85
  num_examples: 5135
- name: academicodec_hifi_16k_320d_large_uni
  num_bytes: 1360089979.85
  num_examples: 5135
- name: academicodec_hifi_24k_320d
  num_bytes: 1480598203.85
  num_examples: 5135
- name: audiodec_24k_320d
  num_bytes: 1892270875.85
  num_examples: 5135
- name: dac_16k
  num_bytes: 1998532027.85
  num_examples: 5135
- name: dac_24k
  num_bytes: 4630613467.85
  num_examples: 5135
- name: dac_44k
  num_bytes: 2254588147.85
  num_examples: 5135
- name: speech_tokenizer_16k
  num_bytes: 1603057051.85
  num_examples: 5135
Dataset size
- download_size: 9300844417
- dataset_size: 16579839734.800001
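The size figures can be cross-checked with a little arithmetic: the per-split `num_bytes` values should add up to `dataset_size`. The sketch below performs that check with the numbers copied from this page and converts the totals to decimal gigabytes. The function names are hypothetical.

```python
import math

# Per-split sizes in bytes, copied from the split list on this page.
SPLIT_BYTES = {
    "academicodec_hifi_16k_320d": 1360089979.85,
    "academicodec_hifi_16k_320d_large_uni": 1360089979.85,
    "academicodec_hifi_24k_320d": 1480598203.85,
    "audiodec_24k_320d": 1892270875.85,
    "dac_16k": 1998532027.85,
    "dac_24k": 4630613467.85,
    "dac_44k": 2254588147.85,
    "speech_tokenizer_16k": 1603057051.85,
}
NUM_EXAMPLES_PER_SPLIT = 5135
DOWNLOAD_SIZE = 9300844417
DATASET_SIZE = 16579839734.800001


def sizes_are_consistent() -> bool:
    """True if the split sizes sum to dataset_size, up to float rounding."""
    total = sum(SPLIT_BYTES.values())
    return math.isclose(total, DATASET_SIZE, rel_tol=0.0, abs_tol=1e-3)


def to_gb(num_bytes: float) -> float:
    """Convert bytes to decimal gigabytes (1 GB = 10**9 bytes)."""
    return num_bytes / 1e9


def mean_bytes_per_example(split: str) -> float:
    return SPLIT_BYTES[split] / NUM_EXAMPLES_PER_SPLIT


if __name__ == "__main__":
    print("consistent:", sizes_are_consistent())
    print(f"download: {to_gb(DOWNLOAD_SIZE):.2f} GB, dataset: {to_gb(DATASET_SIZE):.2f} GB")
    for name in SPLIT_BYTES:
        print(f"{name}: {mean_bytes_per_example(name):,.0f} bytes per example")
```

With the listed values the split sizes do sum to `dataset_size`, and the download is roughly 56% of the stated dataset size.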



