bishtp/code-search-net-tokenized-dataset
Source: Hugging Face. Updated: 2025-08-21. Indexed: 2025-10-25.
Download link:
https://hf-mirror.com/datasets/bishtp/code-search-net-tokenized-dataset
Description:
The dataset has a feature named input_ids, which holds sequences of integers, and is divided into a training split and a validation split. The training split contains 1,360,941 examples (702,245,556 bytes); the validation split contains 13,171 examples (6,796,236 bytes). The full dataset is 709,041,792 bytes, with a download size of 311,461,402 bytes.
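The figures in the description can be cross-checked with a little arithmetic. The sketch below is illustrative only: the constant names and the helper functions are made up for this example, and load_splits assumes the Hugging Face `datasets` package is installed and that the repository id matches the one in the download link. It needs network access, so it is defined but not called here.

```python
# Figures quoted in the description (constant names are illustrative).
TRAIN_EXAMPLES = 1_360_941
TRAIN_BYTES = 702_245_556
VALIDATION_EXAMPLES = 13_171
VALIDATION_BYTES = 6_796_236
DATASET_BYTES = 709_041_792
DOWNLOAD_BYTES = 311_461_402


def bytes_per_example(total_bytes: int, num_examples: int) -> float:
    """Average stored size of one example in a split."""
    return total_bytes / num_examples


def load_splits(repo_id: str = "bishtp/code-search-net-tokenized-dataset"):
    """Hypothetical helper: fetch all splits via the `datasets` library.

    Assumes `pip install datasets` and network access; the split names
    returned are whatever the repository defines.
    """
    from datasets import load_dataset  # lazy import so the arithmetic runs without it

    return load_dataset(repo_id)


if __name__ == "__main__":
    # The two split sizes add up to the reported dataset size.
    print(TRAIN_BYTES + VALIDATION_BYTES == DATASET_BYTES)  # True
    # Both splits average the same number of bytes per example.
    print(bytes_per_example(TRAIN_BYTES, TRAIN_EXAMPLES))  # 516.0
    print(bytes_per_example(VALIDATION_BYTES, VALIDATION_EXAMPLES))  # 516.0
    # Download size as a fraction of the on-disk size (roughly 0.44).
    print(round(DOWNLOAD_BYTES / DATASET_BYTES, 2))  # 0.44
```

Both splits work out to exactly 516 bytes per example, which would be consistent with every input_ids sequence having the same length. The listing does not state a sequence length or integer width, so treat that reading as a guess rather than a documented property.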
Provider:
bishtp



