AdoCleanCode/commonvoice_en_mfa_correct_train_v1_preview

Name: AdoCleanCode/commonvoice_en_mfa_correct_train_v1_preview
Creator: AdoCleanCode
Published: 2025-12-17 19:54:56
License: No description available

Source: Hugging Face. Updated 2025-12-17; indexed 2025-12-20.

Download link:

https://hf-mirror.com/datasets/AdoCleanCode/commonvoice_en_mfa_correct_train_v1_preview

Overview:

The dataset contains multiple audio and text features for audio processing and speech analysis. Specifically, it includes the sample index, original filename, phonemes, left- and right-channel audio, removed audio, full audio, full-tokens audio, audio durations, transcription text, removed words, phoneme annotations, and actual phonemes. One split of the dataset (batch_001) contains 100 samples, with a total size of 49,143,978 bytes and a download size of 45,277,747 bytes.
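The sizes quoted above can be turned into rough per-sample figures. The following is a minimal sketch, not an official loader: the constants are copied from the description, `split_stats` and `load_preview` are hypothetical helper names, and the loader assumes the Hugging Face `datasets` package is installed and that `batch_001` can be passed directly as the split name.

```python
# Figures taken from the dataset description above.
REPO_ID = "AdoCleanCode/commonvoice_en_mfa_correct_train_v1_preview"
SPLIT = "batch_001"
NUM_SAMPLES = 100
DATASET_BYTES = 49_143_978   # total size of the split
DOWNLOAD_BYTES = 45_277_747  # size of the files actually downloaded


def split_stats(num_samples: int, dataset_bytes: int, download_bytes: int) -> dict:
    """Derive simple size statistics for one split."""
    return {
        "avg_bytes_per_sample": dataset_bytes / num_samples,
        "download_ratio": download_bytes / dataset_bytes,
        "dataset_mib": dataset_bytes / 2**20,
    }


def load_preview():
    """Hypothetical helper: load the split with the `datasets` library.

    Assumption: the split is exposed under the name "batch_001".
    The import is local so the statistics above work without `datasets`.
    """
    from datasets import load_dataset

    return load_dataset(REPO_ID, split=SPLIT)


stats = split_stats(NUM_SAMPLES, DATASET_BYTES, DOWNLOAD_BYTES)
print(f"average sample size: {stats['avg_bytes_per_sample']:.0f} bytes")
print(f"download / dataset size: {stats['download_ratio']:.1%}")
print(f"dataset size: {stats['dataset_mib']:.1f} MiB")
```

Calling `load_preview()` requires network access, so it is left uncalled here; the column names it returns should be checked against the actual dataset rather than inferred from the description.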

Provided by:

AdoCleanCode
