The Artifact of the ESEC/FSE 2023 Paper Titled "Natural Language to Code: How Far are We?"
Source: NIAID Data Ecosystem (indexed 2026-05-01)
Download link:
https://zenodo.org/record/7546357
Description:
In this online repository, we release the source code of each selected technique as well as the experiment results for each technique (stored in the Results.zip file). For each technique, we also provide our scripts to fine-tune the approach on the CodeSearchNet-Python dataset. For example, finetune.sh and inference.sh are used to fine-tune and evaluate CodeBERT, respectively, and they are located under "CodeBERT/CodeBERT".
Our evaluation dataset, CodeSearchNet, is a well-known benchmark and can be downloaded from its official webpage.
The code used to calculate the evaluation metrics is reused from CodeBLEU.
Below is a piece of code generated by CodeT5. In this case, CodeT5 generates the same statement repeatedly, which leads to a syntax error. Despite that, the code still implements part of the intended functionality, which is why it achieves a CodeBLEU score of 24.9%.
def makeMimiLocal(filename):
    try:
        with open(filename, 'rb') as f:
            data = f.read()
    except IOError:
        data = b''
    data = data.decode('utf-8')
    data = data.replace(b'\x00', b'\x00')
    data = data.replace(b'\x00', b'\x00')
    data = data.replace(b'\x00', b'\x00')
    data = data.replace(b'\x00', b'\x00')
    data = data.replace(b'\x00', b'\x00')
    data = data.replace(b'\x00', b'\x00')
    data = data.replace(b'\x00', b'\x00')
    data = data.replace(b'\x00', b'\x00')
    data = data.replace(b'\x00', b'\x00')
    data = data.replace(b'\x00', b'\x00')
    data = data.replace(b'\
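One way to flag outputs like the one above is to check whether a generated snippet parses and how often it repeats the same line. The following is a minimal sketch using only the Python standard library; the helper names is_valid_python and longest_repeated_run are hypothetical and are not part of the released scripts.

```python
import ast
from itertools import groupby


def is_valid_python(source: str) -> bool:
    """Return True if `source` parses as Python, False on a syntax error."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True


def longest_repeated_run(source: str) -> int:
    """Length of the longest run of identical consecutive non-blank lines."""
    lines = [line.strip() for line in source.splitlines() if line.strip()]
    return max((len(list(group)) for _, group in groupby(lines)), default=0)


# Illustrative input (not taken from the artifact): a statement repeated
# three times, followed by a line that is cut off inside a string literal.
sample = (
    "def f(x):\n"
    "    x = x.replace('a', 'a')\n"
    "    x = x.replace('a', 'a')\n"
    "    x = x.replace('a', 'a')\n"
    "    x = x.replace('\\"
)

print(is_valid_python(sample))
print(longest_repeated_run(sample))
```

A long repeated run together with a failed parse is a reasonable hint that the model looped until its output was cut off, although neither check says anything about functional correctness.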
We also release the 100 randomly-selected queries as well as the code generated by ChatGPT in the chatGPT.jsonl.
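A JSON Lines file such as chatGPT.jsonl stores one JSON object per line, so it can be read with the standard library alone. The sketch below assumes the file sits in the current working directory and makes no assumption about its field names; it only prints the keys of the first record so the schema can be inspected. The helper name load_jsonl is hypothetical.

```python
import json
from pathlib import Path


def load_jsonl(path):
    """Parse a JSON Lines file into a list of objects, skipping blank lines."""
    records = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records


# Assumed location of the released file; adjust the path as needed.
jsonl_path = Path("chatGPT.jsonl")
if jsonl_path.exists():
    records = load_jsonl(jsonl_path)
    print(len(records))
    if records and isinstance(records[0], dict):
        # Field names are not documented here, so inspect them first.
        print(sorted(records[0].keys()))
```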
Created:
2023-08-18



