kanishka/babylm2-rewritten-clean-spacy_ablate_both_lenient
Source: Hugging Face. Updated 2025-09-11; indexed 2025-10-25.
Download link:
https://hf-mirror.com/datasets/kanishka/babylm2-rewritten-clean-spacy_ablate_both_lenient
Resource overview:
The dataset consists of text features and is intended for training and validating machine learning models. The training split contains 11,780,004 examples (518,565,881 bytes) and the validation split contains 1,223,928 examples (57,764,912 bytes). The full dataset is 576,330,793 bytes, and the download size is 340,676,704 bytes.
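The figures above can be cross-checked with a little arithmetic. The following is a minimal sketch, not part of the original listing: the names `SPLITS`, `summarize`, and `load_splits` are hypothetical, and `load_splits` assumes the Hugging Face `datasets` library is installed and that the repository can be loaded with default settings.

```python
# Split statistics copied from the overview above.
SPLITS = {
    "train": {"examples": 11_780_004, "bytes": 518_565_881},
    "validation": {"examples": 1_223_928, "bytes": 57_764_912},
}
DATASET_BYTES = 576_330_793   # reported total dataset size
DOWNLOAD_BYTES = 340_676_704  # reported download size


def summarize(splits):
    """Derive totals and simple ratios from the per-split statistics."""
    total_examples = sum(s["examples"] for s in splits.values())
    total_bytes = sum(s["bytes"] for s in splits.values())
    return {
        "total_examples": total_examples,
        "total_bytes": total_bytes,
        # Share of all examples that fall in the validation split.
        "validation_fraction": splits["validation"]["examples"] / total_examples,
        # Mean size of one example in bytes, across both splits.
        "avg_bytes_per_example": total_bytes / total_examples,
    }


def load_splits(repo_id="kanishka/babylm2-rewritten-clean-spacy_ablate_both_lenient"):
    """Load the dataset from the Hub.

    Assumption: the `datasets` library is installed and the repository
    loads with default settings. Not called here, so the arithmetic
    below runs without network access.
    """
    from datasets import load_dataset  # imported lazily on purpose

    return load_dataset(repo_id)


summary = summarize(SPLITS)
# The per-split byte counts should add up to the reported total.
assert summary["total_bytes"] == DATASET_BYTES
print(summary)
```

The split sizes sum exactly to the reported total, and the validation split holds a little under a tenth of all examples. Calling `load_splits()` would be the next step for anyone who wants to inspect the records themselves.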
Provided by:
kanishka



