anilkeshwani/GigaSpeech_aligned_hubert
Source: Hugging Face. Updated: 2024-08-30. Indexed: 2024-12-14.
Download link:
https://hf-mirror.com/datasets/anilkeshwani/GigaSpeech_aligned_hubert
Description:
This dataset provides the following features: segment_id, text, text_processed, audio_id, path, speaker, begin_time, end_time, title, url, source, category, original_full_path, tokenized, normalized, uroman, speech_tokens, aligned_token_start_time, and aligned_token_end_time. Together they cover the transcript text, audio references, timestamps, source, and category of each segment. The dataset has a single training split containing 2,266,371 samples, with a total size of 6,154,412,303 bytes (about 6.15 GB).
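As a rough illustration of how the alignment columns might be used, here is a minimal Python sketch. It assumes that the token column and the two `aligned_token_*` columns are parallel lists of equal length; the listing does not state the column types or which text column the alignments refer to, so `token_column="tokenized"` is a guess. The helper names `token_spans` and `stream_train` are hypothetical. Streaming with the Hugging Face `datasets` library is one way to avoid downloading the full split up front.

```python
# Column names as listed in the dataset description.
COLUMNS = [
    "segment_id", "text", "text_processed", "audio_id", "path", "speaker",
    "begin_time", "end_time", "title", "url", "source", "category",
    "original_full_path", "tokenized", "normalized", "uroman",
    "speech_tokens", "aligned_token_start_time", "aligned_token_end_time",
]


def token_spans(example: dict, token_column: str = "tokenized") -> list:
    """Pair each token with its aligned (start, end) time.

    Assumption (not confirmed by the listing): the token column and the two
    alignment columns are parallel lists with one entry per token.
    """
    tokens = example[token_column]
    starts = example["aligned_token_start_time"]
    ends = example["aligned_token_end_time"]
    if not (len(tokens) == len(starts) == len(ends)):
        raise ValueError("token and alignment columns differ in length")
    return list(zip(tokens, starts, ends))


def stream_train(repo_id: str = "anilkeshwani/GigaSpeech_aligned_hubert"):
    """Return the train split as a streaming iterable (requires network access)."""
    # Imported lazily so token_spans works without the `datasets` package.
    from datasets import load_dataset

    return load_dataset(repo_id, split="train", streaming=True)
```

To inspect one real record, something like `token_spans(next(iter(stream_train())))` should work if the assumption about parallel lists holds; if it raises `ValueError`, try a different `token_column`.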
Provided by:
anilkeshwani



