bluelightai-dev/clt-mixed-eval-data
Source: Hugging Face. Updated 2025-12-08; indexed 2025-12-20.
Download link:
https://hf-mirror.com/datasets/bluelightai-dev/clt-mixed-eval-data
Resource overview:
# Mixed Dataset Summary
Generated on 2025-12-08 23:04:52 UTC.
- Total samples: 60,000
- Train samples: 56,999
- Validation samples: 3,001
- Train fraction: 0.95
- Shuffle seed: 85028

| Source | Dataset ID | Samples |
| --- | --- | ---: |
| pretrain | bluelightai-dev/clt-pretrain-data-v2-dedup | 40,000 |
| posttrain | bluelightai-dev/clt_posttrain_data_tokenized | 20,000 |
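
The figures above can be cross-checked with a little arithmetic. The sketch below is only a consistency check on the numbers reported in this summary; the variable names are illustrative and it does not load or reproduce the dataset. Note that the split is one sample off an exact 57,000 / 3,000 division; the summary does not say why, so the check simply reports the realized fraction.

```python
# Figures copied from the summary above.
total = 60_000
train, validation = 56_999, 3_001
sources = {"pretrain": 40_000, "posttrain": 20_000}

# The two splits and the per-source counts should each add up to the total.
assert train + validation == total
assert sum(sources.values()) == total

# Realized train fraction, to compare with the stated 0.95.
realized_train_fraction = train / total
print(f"train fraction: {realized_train_fraction:.5f}")  # train fraction: 0.94998

# Share of each source in the mix.
shares = {name: count / total for name, count in sources.items()}
for name, share in shares.items():
    print(f"{name}: {share:.1%}")  # pretrain: 66.7%, posttrain: 33.3%
```

In other words, the mix is roughly two parts pretrain data to one part posttrain data, and the realized train fraction matches the stated 0.95 to within one sample.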
Provider:
bluelightai-dev



