morphpiece/smollm-combined_smollm135_grouped_2048
Source: Hugging Face. Updated: 2024-10-21. Indexed: 2024-12-14.
Download link:
https://hf-mirror.com/datasets/morphpiece/smollm-combined_smollm135_grouped_2048
Description:
This dataset provides training data with two features: input_ids (a sequence of uint16 values) and source (a string). It has a single train split containing 191,823,359 samples, with a total size of 788,533,645,190 bytes (about 788.5 GB). The download size is 710,886,712,440 bytes (about 710.9 GB).
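As a rough sanity check on the figures above, the following Python sketch derives the average sample size and shows one possible way to peek at a few records. It rests on assumptions that the listing does not confirm: that "grouped_2048" in the dataset name means 2048 token ids per sample, and that the repository id taken from the download link can be loaded with the Hugging Face `datasets` library in streaming mode. The names `ASSUMED_SEQ_LEN` and `peek` are illustrative only.

```python
# Figures quoted in the description above.
NUM_SAMPLES = 191_823_359
TOTAL_BYTES = 788_533_645_190
DOWNLOAD_BYTES = 710_886_712_440

# Repository id taken from the download link.
REPO_ID = "morphpiece/smollm-combined_smollm135_grouped_2048"

# ASSUMPTION (not stated in the listing): "grouped_2048" in the name
# means that every sample holds 2048 token ids.
ASSUMED_SEQ_LEN = 2048
BYTES_PER_TOKEN = 2  # uint16 is 2 bytes wide

# Average stored size of one sample, from the listed totals.
avg_bytes_per_sample = TOTAL_BYTES / NUM_SAMPLES

# Size of input_ids alone for one sample, under the assumption above.
assumed_input_ids_bytes = ASSUMED_SEQ_LEN * BYTES_PER_TOKEN

# Total token count, under the same assumption.
assumed_total_tokens = NUM_SAMPLES * ASSUMED_SEQ_LEN

# Fraction of the total size that the download represents.
download_ratio = DOWNLOAD_BYTES / TOTAL_BYTES


def peek(n=1):
    """Return the first n records via streaming (hypothetical helper).

    Streaming avoids fetching the full download. Requires the
    third-party `datasets` package and network access.
    """
    from datasets import load_dataset

    ds = load_dataset(REPO_ID, split="train", streaming=True)
    return list(ds.take(n))


if __name__ == "__main__":
    print(f"average bytes per sample: {avg_bytes_per_sample:.1f}")
    print(f"assumed input_ids bytes per sample: {assumed_input_ids_bytes}")
    print(f"assumed total tokens: {assumed_total_tokens:,}")
    print(f"download / total size: {download_ratio:.3f}")
```

The average works out to roughly 4.1 KB per sample, which is close to the 4096 bytes that 2048 uint16 values would occupy. That is consistent with the assumed sequence length but does not prove it; calling `peek()` and checking `len(record["input_ids"])` would be one way to confirm.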
Provider:
morphpiece



