mhonsel/edu_fineweb10B_tokens
Source: Hugging Face. Updated 2025-01-02; indexed 2025-08-30.
Download link:
https://hf-mirror.com/datasets/mhonsel/edu_fineweb10B_tokens
Description:
---
pretty_name: FineWeb-Edu sample-10BT GPT-2 tokens
size_categories:
- 1B<n<10B
---
This dataset is the sample-10BT subset of FineWeb-Edu, tokenized with the GPT-2 tokenizer. The 10B tokens are split across 100 .npy files.
https://huggingface.co/datasets/HuggingFaceFW/fineweb-edu
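As a rough illustration of how shards like these might be consumed, the sketch below reads .npy files from a local directory and slices next-token-prediction batches from one shard. It assumes the files have already been downloaded, that each file holds a flat array of token IDs, and that sorting by file name gives a sensible order; none of this is stated on the dataset card. The helper names (`list_shards`, `load_shard`, `make_batch`) and the directory name are hypothetical.

```python
import glob
import os

import numpy as np


def list_shards(data_dir):
    """Return the .npy paths found in data_dir, sorted by file name."""
    return sorted(glob.glob(os.path.join(data_dir, "*.npy")))


def load_shard(path):
    """Load one shard. Assumption: it is a flat array of token IDs."""
    return np.load(path).ravel()


def make_batch(tokens, batch_size, seq_len, offset=0):
    """Slice (inputs, targets) from a token stream; targets are inputs shifted by one."""
    span = batch_size * seq_len
    # One extra token is needed so the last input position has a target.
    chunk = np.asarray(tokens[offset : offset + span + 1], dtype=np.int64)
    if len(chunk) < span + 1:
        raise ValueError("not enough tokens left in this shard")
    x = chunk[:-1].reshape(batch_size, seq_len)
    y = chunk[1:].reshape(batch_size, seq_len)
    return x, y


if __name__ == "__main__":
    # "edu_fineweb10B_tokens" is a placeholder for wherever the files were saved.
    shards = list_shards("edu_fineweb10B_tokens")
    if shards:
        tokens = load_shard(shards[0])
        x, y = make_batch(tokens, batch_size=4, seq_len=1024)
        print(x.shape, y.shape)
```

Converting to int64 only per batch, rather than for the whole shard, keeps memory use close to the on-disk size of the file.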
Provided by:
mhonsel



