CodeComplex
Source: arXiv | Updated: 2024-01-16 | Indexed: 2024-08-06
Download link:
http://arxiv.org/abs/2401.08719v1
Resource overview:
CodeComplex is a large-scale source code dataset for predicting the time complexity of code, created jointly by Yonsei University, Kangwon National University, and Seoul National University. It contains 4,900 Java programs and 4,900 Python programs, 9,800 in total, all drawn from programming contests and manually labeled with a complexity class by algorithm experts. To build the dataset, correct submissions were selected from the Codeforces platform, and each label was assigned through detailed analysis and expert voting. Its main applications are education and improving algorithm efficiency, and it is intended to support deep learning models that predict time complexity accurately.
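The overview says labels were settled by expert voting but does not state the voting rule. As a minimal sketch, assuming a strict-majority rule and hypothetical label strings and sample names (none of these are confirmed by the source), label aggregation might look like this:

```python
from collections import Counter
from typing import Optional, Sequence


def majority_label(votes: Sequence[str]) -> Optional[str]:
    """Return the label backed by a strict majority of annotators, else None."""
    if not votes:
        return None
    label, count = Counter(votes).most_common(1)[0]
    # Assumption: more than half of the votes must agree; otherwise the
    # sample is left unresolved (None) and would need another review round.
    return label if count * 2 > len(votes) else None


# Hypothetical votes from three annotators for two code samples.
sample_votes = {
    "sample_a": ["O(n)", "O(n)", "O(n log n)"],
    "sample_b": ["O(1)", "O(n)", "O(n^2)"],
}
resolved = {name: majority_label(v) for name, v in sample_votes.items()}
```

Here `sample_a` resolves to a single label, while `sample_b` has no majority and stays unresolved. A real annotation pipeline may break ties differently.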
Provided by:
Yonsei University, South Korea; Kangwon National University, South Korea; Seoul National University, South Korea
Created:
2024-01-16
Collected Summary
Dataset Introduction

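Background and Challenges
Background Overview
CodeComplex is a dataset for predicting the time complexity of code. It contains 9,800 Java and Python code samples from programming contests, each manually labeled with its complexity by experts, and is mainly used in research on education and algorithm efficiency optimization.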



