Alignment-Lab-AI/wikitext-2-raw-bytepair
Source: Hugging Face. Updated 2024-12-08; indexed 2024-12-14.
Download link:
https://hf-mirror.com/datasets/Alignment-Lab-AI/wikitext-2-raw-bytepair
Resource overview:
The dataset contains the feature bytepair_ids, a sequence of int64 values. It is divided into three splits: test (4358 samples, 10295552 bytes), train (36718 samples, 87275496 bytes), and validation (3760 samples, 9149336 bytes). The download size is 16150551 bytes and the total dataset size is 106720384 bytes, which equals the sum of the three split sizes. The data files are configured by path pattern: data/test-* for the test split, data/train-* for the train split, and data/validation-* for the validation split.
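The figures in the overview can be cross-checked with a few lines of Python. The sketch below copies the split sizes from the description and confirms that they add up to the reported total. The helper names (total_bytes, total_examples, expansion_ratio, load_split) are hypothetical and introduced only for illustration. load_split assumes the Hugging Face `datasets` package is installed and that the repository is reachable over the network; it is not called by default.

```python
# Split metadata copied from the dataset description above.
SPLITS = {
    "test": {"num_examples": 4358, "num_bytes": 10295552},
    "train": {"num_examples": 36718, "num_bytes": 87275496},
    "validation": {"num_examples": 3760, "num_bytes": 9149336},
}
REPORTED_TOTAL_BYTES = 106720384
REPORTED_DOWNLOAD_BYTES = 16150551


def total_bytes(splits):
    """Sum the per-split byte sizes."""
    return sum(info["num_bytes"] for info in splits.values())


def total_examples(splits):
    """Sum the per-split sample counts."""
    return sum(info["num_examples"] for info in splits.values())


def expansion_ratio(dataset_bytes, download_bytes):
    """Ratio of the total dataset size to the download size."""
    return dataset_bytes / download_bytes


def load_split(split):
    """Load one split from the Hub.

    Assumption: requires the `datasets` package and network access.
    The import is kept local so the checks above run without it.
    """
    from datasets import load_dataset

    return load_dataset("Alignment-Lab-AI/wikitext-2-raw-bytepair", split=split)


if __name__ == "__main__":
    # The three split sizes should add up to the reported total size.
    print("sizes consistent:", total_bytes(SPLITS) == REPORTED_TOTAL_BYTES)
    print("total samples:", total_examples(SPLITS))
    print("size ratio:", expansion_ratio(REPORTED_TOTAL_BYTES, REPORTED_DOWNLOAD_BYTES))
```

To inspect actual records, call load_split("validation") and read the bytepair_ids field of the first row; each value should be a list of integers.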
Provider:
Alignment-Lab-AI



