Raw Python Code Corpus
Source: NIAID Data Ecosystem (indexed 2026-03-11)
Download link:
https://zenodo.org/record/3628783
Description:
A raw code corpus for the Python programming language: it includes only the Python source files of each repository, without any preprocessing.
The corpus was used to generate the Python training, validation, test, and BPE encoding sets for the experiments in the paper "Big Code != Big Vocabulary: Open-Vocabulary Models for Source Code".
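Because the corpus is distributed as unprocessed source files, a consumer typically has to walk the extracted repositories and read each file itself. The following is a minimal sketch of that step, not the authors' tooling. It assumes the archive has been extracted to a local directory containing one subdirectory per repository; that layout, the function names iter_python_files and read_raw, and the choice to replace undecodable bytes are all assumptions made for illustration.

```python
from pathlib import Path
from typing import Iterator


def iter_python_files(corpus_root: str) -> Iterator[Path]:
    """Yield every .py file under corpus_root in a reproducible order."""
    # rglob walks the directory tree recursively; sorting makes the
    # order stable across runs and platforms.
    for path in sorted(Path(corpus_root).rglob("*.py")):
        if path.is_file():  # skip directories that happen to end in .py
            yield path


def read_raw(path: Path) -> str:
    """Return the file's text without any preprocessing."""
    # Decoding from bytes avoids newline translation. Raw corpora can
    # contain files that are not valid UTF-8, so undecodable bytes are
    # replaced instead of raising (an assumption, not the authors' choice).
    return path.read_bytes().decode("utf-8", errors="replace")
```

A caller could then iterate with `for path in iter_python_files("corpus_dir"): text = read_raw(path)`, where "corpus_dir" is a hypothetical extraction path, and apply its own splitting or tokenization afterwards.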
Created:
2020-01-29



