OpenSpeechHub/Emilia-EN-SmolLM2-135M-Tokenized
Source: Hugging Face | Updated: 2025-06-29 | Indexed: 2025-07-05
Download link:
https://hf-mirror.com/datasets/OpenSpeechHub/Emilia-EN-SmolLM2-135M-Tokenized
Description:
This dataset collects information about audio files: each audio file is accompanied by its corresponding text, language, speaker, and other details. It is intended for model training and provides input_ids, labels, and attention_mask fields for that purpose. The data is provided as a training split containing a large number of audio file examples.
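As an illustration of how the input_ids, labels, and attention_mask fields are typically consumed during training, here is a minimal batching sketch. It assumes the fields follow the common causal language model convention (equal-length integer lists per example), which this listing does not confirm. The helper name `pad_batch` and the default `pad_token_id=0` are hypothetical; the value -100 is the default index ignored by PyTorch's cross-entropy loss.

```python
from typing import Dict, List

# Label value that PyTorch's cross-entropy loss skips by default.
IGNORE_INDEX = -100


def pad_batch(
    examples: List[Dict[str, List[int]]],
    pad_token_id: int = 0,  # hypothetical; use the tokenizer's real pad id
) -> Dict[str, List[List[int]]]:
    """Right-pad tokenized examples to the length of the longest one."""
    max_len = max(len(ex["input_ids"]) for ex in examples)
    batch: Dict[str, List[List[int]]] = {
        "input_ids": [],
        "labels": [],
        "attention_mask": [],
    }
    for ex in examples:
        pad = max_len - len(ex["input_ids"])
        # Padded positions get the pad token, are excluded from the loss,
        # and are masked out of attention.
        batch["input_ids"].append(ex["input_ids"] + [pad_token_id] * pad)
        batch["labels"].append(ex["labels"] + [IGNORE_INDEX] * pad)
        batch["attention_mask"].append(ex["attention_mask"] + [0] * pad)
    return batch
```

The function takes a list of example dictionaries, such as rows read from the training split, and returns lists that can be converted to tensors by the training framework of choice.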
Provider:
OpenSpeechHub



