OpenSpeechHub/Alhaitham-tokenized

Name: OpenSpeechHub/Alhaitham-tokenized
Creator: OpenSpeechHub
Published: 2025-08-15 09:21:45
License: No description available

Source: Hugging Face
Updated: 2025-08-15
Indexed: 2025-09-13

Download link:

https://hf-mirror.com/datasets/OpenSpeechHub/Alhaitham-tokenized

Description:

Each record in this dataset contains three sequences: input IDs, labels, and an attention mask. It is intended primarily for training machine learning models. The dataset has a single training split of 945 samples, totaling 6,892,916 bytes.
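To illustrate how the three sequences in a tokenized record typically relate to one another, here is a minimal sketch in Python. It is not taken from the dataset itself: the field names `input_ids`, `labels`, and `attention_mask`, the padding ID `0`, the ignore value `-100`, and the helper `pad_record` are all assumptions based on common conventions for tokenized training data, and the actual columns and values in this dataset may differ.

```python
from typing import Dict, List

PAD_ID = 0           # assumed padding token ID (not confirmed by the listing)
IGNORE_INDEX = -100  # assumed label value for positions excluded from the loss


def pad_record(token_ids: List[int], max_len: int) -> Dict[str, List[int]]:
    """Build one padded record with input IDs, labels, and an attention mask.

    Hypothetical helper for illustration only.
    """
    ids = token_ids[:max_len]          # truncate if longer than max_len
    n_pad = max_len - len(ids)         # number of padding positions to add
    return {
        "input_ids": ids + [PAD_ID] * n_pad,
        "labels": ids + [IGNORE_INDEX] * n_pad,
        "attention_mask": [1] * len(ids) + [0] * n_pad,
    }


record = pad_record([101, 7, 42, 9], max_len=6)
for name, values in record.items():
    print(name, values)
```

In this sketch the attention mask is 1 for real tokens and 0 for padding, and padded label positions carry the ignore value so that a loss function configured to skip that value would not count them.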

Provided by:

OpenSpeechHub
