skymizer/Llama3.1-8B-base-tokenized-fineweb-edu-test-4K
Source: Hugging Face. Updated: 2025-01-12. Indexed: 2025-04-12.
Download link:
https://hf-mirror.com/datasets/skymizer/Llama3.1-8B-base-tokenized-fineweb-edu-test-4K
Description:
This dataset is a test set with five fields: input_ids, attention_mask, labels, position_ids, and length. The first four are sequence fields: input_ids stores integers (the width is not stated), attention_mask stores 8-bit integers, and labels and position_ids both store 64-bit integers. The length field is a scalar 64-bit integer. The test set contains 4,537 samples, and the total file size is 93,216,882 bytes.
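The field layout described above can be checked per record with a few lines of Python. The following is a minimal sketch, not code published with the dataset: the names `check_record` and `load_test_split` are hypothetical, the split name "test" is an assumption based on the description, and `load_test_split` assumes the third-party `datasets` library is installed and the Hub is reachable.

```python
from typing import Any, Dict, List, Optional, Tuple

INT8_RANGE = (-2**7, 2**7 - 1)
INT64_RANGE = (-2**63, 2**63 - 1)

# Sequence fields and the integer range each one is described as storing.
# The description gives no width for input_ids, so no range is enforced there.
SEQUENCE_FIELDS: Dict[str, Optional[Tuple[int, int]]] = {
    "input_ids": None,
    "attention_mask": INT8_RANGE,
    "labels": INT64_RANGE,
    "position_ids": INT64_RANGE,
}


def _is_int(value: Any) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def check_record(record: Dict[str, Any]) -> List[str]:
    """Return a list of schema problems; an empty list means the record conforms."""
    problems: List[str] = []
    for name, bounds in SEQUENCE_FIELDS.items():
        values = record.get(name)
        if not isinstance(values, list):
            problems.append(f"{name}: expected a list of integers")
            continue
        for v in values:
            if not _is_int(v):
                problems.append(f"{name}: non-integer element {v!r}")
                break
            if bounds is not None and not (bounds[0] <= v <= bounds[1]):
                problems.append(f"{name}: element {v} outside {bounds}")
                break
    length = record.get("length")
    if not _is_int(length):
        problems.append("length: expected a scalar integer")
    elif not (INT64_RANGE[0] <= length <= INT64_RANGE[1]):
        problems.append("length: outside the 64-bit integer range")
    return problems


def load_test_split():
    """Hypothetical loader; the split name "test" is an assumption."""
    from datasets import load_dataset  # third-party, imported lazily

    return load_dataset(
        "skymizer/Llama3.1-8B-base-tokenized-fineweb-edu-test-4K",
        split="test",
    )
```

If the loader works in your environment, something like `check_record(load_test_split()[0])` would validate the first sample; the checker itself needs only the standard library.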
Provider:
skymizer



