Ba2han/1202-tokenized-1

Source: Hugging Face | Updated: 2026-02-12 | Indexed: 2026-03-29
Download link:
https://hf-mirror.com/datasets/Ba2han/1202-tokenized-1
Description:
---
license: mit
language:
- en
- tr
tags:
- unsloth
- llama-3.2
- tokenized
---

# Dataset Statistics

**Tokenizer:** `unsloth/Llama-3.2-1B`

**Total Files:** 25

**Grand Total Tokens:** 1,399,855,102

## File Breakdown

| File Name              | Total Tokens | Avg Tokens | Median Tokens |
|:-----------------------|:-------------|-----------:|--------------:|
| processed_0000.parquet | 57,139,999   | 571.4      | 490           |
| processed_0001.parquet | 57,058,976   | 570.59     | 489           |
| processed_0002.parquet | 57,186,019   | 571.86     | 489           |
| processed_0003.parquet | 57,244,404   | 572.44     | 489           |
| processed_0004.parquet | 57,115,946   | 571.16     | 489           |
| processed_0005.parquet | 57,169,206   | 571.69     | 490           |
| processed_0006.parquet | 57,077,286   | 570.77     | 490           |
| processed_0007.parquet | 57,056,390   | 570.56     | 489           |
| processed_0008.parquet | 56,930,623   | 569.31     | 488           |
| processed_0009.parquet | 57,115,109   | 571.15     | 490           |
| processed_0010.parquet | 56,872,412   | 568.72     | 489           |
| processed_0011.parquet | 57,128,465   | 571.28     | 490           |
| processed_0012.parquet | 57,102,270   | 571.02     | 491           |
| processed_0013.parquet | 57,256,564   | 572.57     | 491           |
| processed_0014.parquet | 57,140,115   | 571.4      | 489           |
| processed_0015.parquet | 57,065,110   | 570.65     | 489           |
| processed_0016.parquet | 57,133,524   | 571.34     | 490           |
| processed_0017.parquet | 57,065,756   | 570.66     | 490           |
| processed_0018.parquet | 57,090,426   | 570.9      | 490           |
| processed_0019.parquet | 56,987,728   | 569.88     | 489           |
| processed_0020.parquet | 57,101,876   | 571.02     | 489           |
| processed_0021.parquet | 57,115,846   | 571.16     | 490           |
| processed_0022.parquet | 57,312,919   | 573.13     | 490           |
| processed_0023.parquet | 57,078,317   | 570.78     | 490           |
| processed_0024.parquet | 29,309,816   | 569.72     | 489           |
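
The per-file columns above are simple aggregates over per-row token counts. As a minimal sketch (not necessarily how the dataset author produced the table), the following shows how such figures could be computed from a list of row lengths, and how a row count can be estimated back from a table entry. The function names `summarize_lengths` and `estimate_rows` are hypothetical and introduced only for illustration.

```python
from statistics import median


def summarize_lengths(lengths):
    """Return (total, average, median) for a list of per-row token counts.

    Hypothetical helper: mirrors the Total / Avg / Median columns of the table.
    """
    total = sum(lengths)
    return total, round(total / len(lengths), 2), median(lengths)


def estimate_rows(total_tokens, avg_tokens):
    """Estimate the row count implied by a table entry.

    The published average is rounded, so the result is only an approximation.
    """
    return round(total_tokens / avg_tokens)


# Toy input, not taken from the dataset.
print(summarize_lengths([3, 5, 10]))     # (18, 6.0, 5)

# Figures from the processed_0000.parquet row of the table.
print(estimate_rows(57_139_999, 571.4))  # 100000
```

Applied to the first row, the estimate suggests roughly 100,000 rows in that file. This is an inference from the rounded average rather than a figure stated on the dataset card.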
Provider:
Ba2han