stojchet/5-csn_java_python_subset
Source: Hugging Face. Updated 2024-07-10; indexed 2024-07-22.
Download link:
https://hf-mirror.com/datasets/stojchet/5-csn_java_python_subset
Description:
The dataset contains function-level code data in two configurations, one for Java and one for Python. Each record describes a single function and provides the repository name, the function's path in the repository, the function name, the whole function string, the programming language, the function code string, the tokenized function code, the function documentation string, the tokenized documentation, the split name, and the URL of the function code. Each configuration provides a training split: 10507 examples for Java and 10421 examples for Python.
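As a rough illustration of the record layout described above, the following sketch models one row as a Python dataclass and checks that a row carries every listed field. The field names follow the usual CodeSearchNet naming and are an assumption, as are the `FunctionRecord` and `parse_record` names and the sample row; none of them are confirmed by this listing.

```python
from dataclasses import dataclass, fields
from typing import Any, Dict, List


@dataclass
class FunctionRecord:
    """One function entry. Field names are assumed, not confirmed by the listing."""
    repository_name: str
    func_path_in_repository: str
    func_name: str
    whole_func_string: str
    language: str
    func_code_string: str
    func_code_tokens: List[str]
    func_documentation_string: str
    func_documentation_tokens: List[str]
    split_name: str
    func_code_url: str


# Training-set sizes as stated in the description, keyed by assumed config name.
EXPECTED_TRAIN_SIZES = {"java": 10507, "python": 10421}


def parse_record(row: Dict[str, Any]) -> FunctionRecord:
    """Build a FunctionRecord from a raw row, failing loudly on missing fields."""
    expected = [f.name for f in fields(FunctionRecord)]
    missing = [name for name in expected if name not in row]
    if missing:
        raise KeyError(f"missing fields: {missing}")
    # Extra keys in the row are ignored; only the expected fields are copied.
    return FunctionRecord(**{name: row[name] for name in expected})


# Hypothetical sample row, invented purely for illustration.
sample_row = {
    "repository_name": "example/repo",
    "func_path_in_repository": "src/math_utils.py",
    "func_name": "add",
    "whole_func_string": 'def add(a, b):\n    """Add two numbers."""\n    return a + b',
    "language": "python",
    "func_code_string": "def add(a, b):\n    return a + b",
    "func_code_tokens": ["def", "add", "(", "a", ",", "b", ")", ":", "return", "a", "+", "b"],
    "func_documentation_string": "Add two numbers.",
    "func_documentation_tokens": ["Add", "two", "numbers", "."],
    "split_name": "train",
    "func_code_url": "https://example.com/example/repo/src/math_utils.py",
}

record = parse_record(sample_row)
```

With the Hugging Face `datasets` library, real rows could presumably be fetched with `load_dataset("stojchet/5-csn_java_python_subset", "java", split="train")`, assuming the configuration is actually named `java`, and then passed through `parse_record` one at a time.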
Provider:
stojchet



