kanishka/babylm2-rewritten-clean_hierarchical-adj_size-origin_adj1-ablation
Source: Hugging Face. Updated: 2025-10-25. Indexed: 2025-11-15.
Download link:
https://hf-mirror.com/datasets/kanishka/babylm2-rewritten-clean_hierarchical-adj_size-origin_adj1-ablation
Description:
The dataset contains text data split into a training set of 11,998,717 samples and a validation set of 1,223,928 samples. The total dataset size is 577,583,910 bytes, and the download size is 353,455,293 bytes.
Provided by:
kanishka
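As a rough illustration of the figures in the description, the sketch below derives a few summary ratios from the listed counts and sizes. It assumes the stated total size covers both splits together. The helper `stream_split` is hypothetical: it assumes the Hugging Face `datasets` package is installed, that network access is available, and that the splits are named `train` and `validation`, none of which the listing confirms.

```python
# Figures copied from the listing; nothing here is measured independently.
REPO_ID = "kanishka/babylm2-rewritten-clean_hierarchical-adj_size-origin_adj1-ablation"
TRAIN_SAMPLES = 11_998_717
VALIDATION_SAMPLES = 1_223_928
DATASET_BYTES = 577_583_910
DOWNLOAD_BYTES = 353_455_293


def summarize():
    """Derive simple ratios from the listed sample counts and byte sizes."""
    total = TRAIN_SAMPLES + VALIDATION_SAMPLES
    return {
        "total_samples": total,
        "validation_share": VALIDATION_SAMPLES / total,
        # Assumption: DATASET_BYTES is the combined size of both splits.
        "avg_bytes_per_sample": DATASET_BYTES / total,
        "download_ratio": DOWNLOAD_BYTES / DATASET_BYTES,
    }


def stream_split(split="train"):
    """Hypothetical helper: stream one split without downloading it in full.

    Assumptions: the `datasets` package is installed, the network is
    reachable, and the split names are "train" and "validation".
    """
    from datasets import load_dataset  # imported lazily; optional dependency

    return load_dataset(REPO_ID, split=split, streaming=True)


if __name__ == "__main__":
    for key, value in summarize().items():
        print(f"{key}: {value}")
```

Under these assumptions, the validation set is roughly 9% of all samples, the average sample is on the order of 44 bytes, and the download is about 61% of the full dataset size.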



