CoNaLa (CMU CoNaLa, the Code/Natural Language Challenge)

Name: CoNaLa (CMU CoNaLa, the Code/Natural Language Challenge)
Creator: OpenDataLab
Published: 2026-05-24 05:30:03
License: Not specified

OpenDataLab. Updated 2026-05-24; indexed 2024-05-09.

Download link:

https://opendatalab.org.cn/OpenDataLab/CoNaLa

Resource description:

CMU CoNaLa, the Code/Natural Language Challenge dataset, is a joint project of Carnegie Mellon University's NeuLab and Strudel Lab. It is designed to test the generation of code snippets from natural language. The data comes from StackOverflow questions. The manually annotated portion contains 2,379 training examples and 500 test examples, each pairing a natural language intent with its corresponding Python snippet. In addition, there are 598,237 automatically mined intent-snippet pairs. These have the same form as the annotated examples, except that each one also carries a probability that the pair is valid.
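Because each mined pair carries a validity probability, a common first step is to keep only the pairs above some threshold. The following is a minimal sketch of that idea, not the dataset's official loader. The field names (`intent`, `snippet`, `prob`), the one-JSON-object-per-line layout, the sample records, and the 0.5 threshold are all assumptions made for illustration; check them against the files you actually download.

```python
import json

# Hypothetical sample records. The field names ("intent", "snippet", "prob")
# and the JSON-lines layout are assumptions for illustration only.
SAMPLE_MINED_JSONL = """\
{"intent": "sort a list in reverse order", "snippet": "xs.sort(reverse=True)", "prob": 0.91}
{"intent": "read a file", "snippet": "import os", "prob": 0.12}
{"intent": "join strings with a comma", "snippet": "','.join(items)", "prob": 0.67}
"""


def filter_mined_pairs(lines, min_prob):
    """Return (intent, snippet) tuples whose probability is at least min_prob."""
    kept = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        if record["prob"] >= min_prob:
            kept.append((record["intent"], record["snippet"]))
    return kept


# The 0.5 threshold is arbitrary; tune it for your own use case.
high_confidence = filter_mined_pairs(SAMPLE_MINED_JSONL.splitlines(), min_prob=0.5)
for intent, snippet in high_confidence:
    print(f"{intent!r} -> {snippet}")
```

A higher threshold trades coverage for cleaner pairs, so the right value depends on how much noise the downstream model tolerates.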

Provider:

OpenDataLab

Created:

2022-05-23

Collection summary

Dataset introduction