KantaHayashiAI/test2
Source: Hugging Face. Updated 2026-04-02. Indexed 2026-04-12.
Download link:
https://hf-mirror.com/datasets/KantaHayashiAI/test2
Description:
---
pretty_name: test strict shuffled
task_categories:
- text-classification
tags:
- parquet
- shuffled
- exact-shuffle
---
# Strict shuffled copy of `KantaHayashiAI/test`
This dataset was produced by assigning each source row a deterministic pseudo-random sort key
derived from:
- source parquet path
- row index inside that parquet
- shuffle seed `2026-04-01-strict-shuffle-v1`
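
The key derivation above can be sketched as follows. This is a minimal illustration, not the author's implementation: the card does not name the hash function, so `blake2b`, the 128-bit key width, the NUL-separated payload layout, and the function name `sort_key` are all assumptions. Only the three inputs and the seed string come from the card.

```python
import hashlib
from typing import Tuple

# Seed string as stated in the dataset card.
SEED = "2026-04-01-strict-shuffle-v1"


def sort_key(parquet_path: str, row_idx: int, seed: str = SEED) -> Tuple[int, int]:
    """Return a hypothetical (key_hi, key_lo) pair of unsigned 64-bit integers.

    Assumption: the key is a 128-bit hash of seed, path, and row index.
    The real hash function and payload layout are not given in the card.
    """
    # NUL separators keep ("a", 12) and ("a1", 2) from colliding in the payload.
    payload = f"{seed}\x00{parquet_path}\x00{row_idx}".encode("utf-8")
    digest = hashlib.blake2b(payload, digest_size=16).digest()
    key_hi = int.from_bytes(digest[:8], "big")   # high 64 bits
    key_lo = int.from_bytes(digest[8:], "big")   # low 64 bits
    return key_hi, key_lo
```

Because the key depends only on the seed, the path, and the row index, re-running the derivation on the same source files reproduces the same keys, which is what makes the shuffle repeatable.
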
Rows were first partitioned into `512` buckets by the high bits of the key,
then each bucket was fully sorted by `(__key_hi, __key_lo, __file_id, __row_idx)`.
This yields a deterministic global shuffled order without requiring the full dataset
to be materialized twice on local disk.
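
The bucket-then-sort step can be sketched as follows. This is an in-memory illustration under stated assumptions, not the pipeline that produced the dataset: `__key_hi` is assumed to be an unsigned 64-bit value, the bucket is assumed to be its top 9 bits (since 2^9 = 512), and the helpers `make_row`, `bucket_of`, and `strict_shuffle` are hypothetical names. For brevity `make_row` hashes the file id rather than the parquet path.

```python
import hashlib
from typing import Dict, List, Tuple

NUM_BUCKETS = 512   # bucket count from the dataset card
BUCKET_BITS = 9     # 2**9 == 512
KEY_BITS = 64       # assumed width of __key_hi

# (__key_hi, __key_lo, __file_id, __row_idx), the sort tuple named in the card
Row = Tuple[int, int, int, int]


def make_row(file_id: int, row_idx: int, seed: str) -> Row:
    """Build a row tuple with a hypothetical blake2b-derived 128-bit key."""
    payload = f"{seed}\x00{file_id}\x00{row_idx}".encode("utf-8")
    digest = hashlib.blake2b(payload, digest_size=16).digest()
    key_hi = int.from_bytes(digest[:8], "big")
    key_lo = int.from_bytes(digest[8:], "big")
    return (key_hi, key_lo, file_id, row_idx)


def bucket_of(row: Row) -> int:
    """Bucket index taken from the high bits of __key_hi."""
    return row[0] >> (KEY_BITS - BUCKET_BITS)


def strict_shuffle(rows: List[Row]) -> List[Row]:
    """Partition rows into buckets, sort each bucket, concatenate in bucket order."""
    buckets: Dict[int, List[Row]] = {}
    for row in rows:
        buckets.setdefault(bucket_of(row), []).append(row)
    ordered: List[Row] = []
    for b in range(NUM_BUCKETS):
        # Each bucket is sorted independently, so only one bucket
        # needs to be held in memory at a time in a streaming version.
        ordered.extend(sorted(buckets.get(b, [])))
    return ordered
```

The bucket index is a monotone function of `__key_hi`, the leading sort component, so concatenating the sorted buckets in index order gives the same result as one global sort over all rows. The trailing `(__file_id, __row_idx)` components only break ties between rows whose keys collide.
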
Expected train shard count: `512`.
Provider:
KantaHayashiAI



