CodeQueries

Name: CodeQueries
Creator: Indian Institute of Technology
Published: 2023-07-14 19:01:45
License: Not specified

Source: arXiv. Updated: 2023-07-14. Indexed: 2024-06-21.

Download link:

https://github.com/thepurpleowl/codequerie-benchmark

Description:

CodeQueries is a dataset for testing how well neural models understand code semantics, created by the Indian Institute of Technology. It contains 133,456 semantic queries over Python code, covering correctness, reliability, maintainability, and security. The dataset is built on CodeQL, a widely used static analysis tool, and includes both positive and negative examples as well as queries that require single-hop and multi-hop reasoning. CodeQueries frames evaluation as extractive question answering, and is aimed in particular at cases that demand complex reasoning.
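To make the extractive question answering setting concrete, here is a minimal sketch of how a single example might be represented. The class name, field names, and the use of character offsets are assumptions made for illustration only and do not reflect the dataset's actual schema; the query text is likewise an illustrative placeholder. A positive example carries one or more answer spans inside the code, while a negative example carries none.

```python
from dataclasses import dataclass, field


@dataclass
class QueryExample:
    """Hypothetical record layout; NOT the dataset's real schema."""
    query: str  # short description of the semantic query being asked
    code: str   # Python source the query is asked about
    # Character offsets [start, end) into `code`; empty for negative examples.
    answer_spans: list[tuple[int, int]] = field(default_factory=list)

    @property
    def is_positive(self) -> bool:
        # An example is positive when at least one answer span exists.
        return bool(self.answer_spans)


def extract_answers(example: QueryExample) -> list[str]:
    """Return the code fragments selected by the answer spans."""
    return [example.code[start:end] for start, end in example.answer_spans]


# Illustrative data only: the first snippet imports `os` without using it.
positive = QueryExample(
    query="Unused import",
    code="import os\nimport sys\nprint(sys.argv)\n",
    answer_spans=[(0, 9)],
)
negative = QueryExample(
    query="Unused import",
    code="import sys\nprint(sys.argv)\n",
)

print(extract_answers(positive))
print(negative.is_positive)
```

A model evaluated in this setting would be asked to predict the spans (or to predict that there are none) given only the query and the code.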

Provider:

Indian Institute of Technology

Created:

2022-09-18
