KrzTyb/fim-dataset
Source: Hugging Face. Updated 2025-10-17. Indexed 2025-10-25.
Download link:
https://hf-mirror.com/datasets/KrzTyb/fim-dataset
Description:
fim-dataset is a collection of code snippets for training code autocompletion models with the Fill-in-the-Middle (FIM) approach. Each example is a formatted snippet made up of a prefix, a suffix, and the span of code to be completed, with special tokens marking each part. The dataset is split into a training set of 42,922 examples and a validation set of 2,259 examples.
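To illustrate the prefix/suffix/middle layout described above, here is a minimal Python sketch of how one FIM example could be assembled and split apart again. The listing does not name the special tokens or the part ordering the dataset uses, so the sentinel strings `<fim_prefix>`, `<fim_suffix>`, and `<fim_middle>`, the prefix-suffix-middle ordering, and the helper names `make_fim_example` and `split_fim_example` are all assumptions made for illustration, not the dataset's confirmed format.

```python
# Hypothetical sentinel tokens (assumption): the dataset's actual tokens may differ.
FIM_PREFIX = "<fim_prefix>"
FIM_SUFFIX = "<fim_suffix>"
FIM_MIDDLE = "<fim_middle>"


def make_fim_example(code: str, start: int, end: int) -> str:
    """Cut code[start:end] out as the middle and emit prefix, suffix, middle in that order."""
    if not 0 <= start <= end <= len(code):
        raise ValueError("require 0 <= start <= end <= len(code)")
    prefix, middle, suffix = code[:start], code[start:end], code[end:]
    return f"{FIM_PREFIX}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}{middle}"


def split_fim_example(text: str) -> tuple[str, str, str]:
    """Recover (prefix, middle, suffix) from a string built by make_fim_example."""
    head, middle = text.split(FIM_MIDDLE, 1)
    prefix, suffix = head.removeprefix(FIM_PREFIX).split(FIM_SUFFIX, 1)
    return prefix, middle, suffix


if __name__ == "__main__":
    snippet = "def add(a, b):\n    return a + b\n"
    start = snippet.index("return")
    end = start + len("return a + b")
    print(make_fim_example(snippet, start, end))
```

With this ordering, a model sees the code before and after the gap first and is trained to generate the missing span after the final sentinel. The round trip through `split_fim_example` is a quick way to check that an example was formatted consistently.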
Provider:
KrzTyb



