felixgiov/ja_html_nii_01-06_llama_tokenized
Source: Hugging Face. Updated: 2025-02-04. Indexed: 2025-02-15.
Download link:
https://hf-mirror.com/datasets/felixgiov/ja_html_nii_01-06_llama_tokenized
Description:
The dataset stores integer sequences under the feature input_ids and provides a training split. The training split contains 122,568 examples. The total dataset size is 502,528,800 bytes, and the download size is 225,324,998 bytes. The README does not state the intended use case or the data source.
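As a rough sanity check on the figures in the description, the sketch below derives the average stored size per example and the download-to-full-size ratio. The constant names and the `load_train_split` helper are illustrative assumptions, not part of the dataset card; the helper assumes the `datasets` library is installed and that the repository id matches the one in the download link.

```python
# Figures quoted in the description above.
NUM_EXAMPLES = 122_568
DATASET_BYTES = 502_528_800
DOWNLOAD_BYTES = 225_324_998

# Average stored size of one example, in bytes.
bytes_per_example = DATASET_BYTES / NUM_EXAMPLES

# Fraction of the full dataset size that has to be downloaded.
download_ratio = DOWNLOAD_BYTES / DATASET_BYTES


def load_train_split():
    """Hypothetical helper: load the training split with the `datasets` library.

    Assumes `datasets` is installed and that the repository id taken from
    the download link is reachable from your environment.
    """
    from datasets import load_dataset  # imported lazily so the arithmetic runs without it

    return load_dataset("felixgiov/ja_html_nii_01-06_llama_tokenized", split="train")


if __name__ == "__main__":
    print(f"{bytes_per_example:.0f} bytes per example on average")
    print(f"download is {download_ratio:.1%} of the full size")
```

The helper is defined but not called, so the arithmetic can be checked offline; calling `load_train_split()` would fetch the data.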
Provider:
felixgiov



