
AOCG

Source: DataCite Commons. Updated 2023-12-07; indexed 2024-08-18.
Download link:
https://figshare.com/articles/dataset/AOCG/24763701
Resource description:
### The replication package of AOCG

The repository is divided into two parts: the datasets and the code of our AOCG method.

### Requirements
```
- python 3.8
- Java 1.8.0
- transformers 4.5.1
- tree-sitter 0.2.2
- PyTorch 1.7.1
```

### Data Preprocessing
The experimental datasets are the API_SUM dataset, the Hearthstone dataset, and the MBPP dataset. We use the tree-sitter tool to automatically extract the API terms and sketches of programs.

Take the MBPP dataset as an example:

To extract API terms, run 'data_process/api_extract.py' to obtain 'api_terms.jsonl'.

To extract sketches, run 'data_process/sketch_extract.py' to obtain 'sketches.jsonl'.

Put the API terms, sketches, complete code, and requirements into 'final_train.jsonl' and 'final_test.jsonl'.

### Training
Given a specific requirement, the APIer predicts API terms, and the Sketcher outputs the corresponding sketch based on the API terms and the requirement. The Coder then fills in the sketch to produce a complete program according to the API terms, sketch, and requirement.

```
export CUDA_VISIBLE_DEVICES=0

python AOCG_finetune.py \
--stage_1 nl_pp \
--stage_2 nl_pp_ss \
--stage_3 nl_ss_pp_code \
--local_rank -1
```

### Inference
AOCG predicts code snippets in a progressive generation manner and writes the predicted code to 'xx.output'.

```
export CUDA_VISIBLE_DEVICES=0

python AOCG_inference.py \
--stage_1 nl_pp \
--stage_2 nl_pp_ss \
--stage_3 nl_ss_pp_code \
--local_rank -1
```

### Evaluation
After obtaining the generated code, evaluate the programs by running 'evaluator/evaluation.py'.
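The last preprocessing step described above (combining the API terms, sketches, complete code, and requirements into 'final_train.jsonl') is not spelled out in the package description. A minimal sketch of how it might be done is given below. It assumes the four inputs are line-aligned, meaning record i in each input belongs to the same example. The helper names (`read_jsonl`, `merge_records`, `write_jsonl`) and the output keys (`nl`, `api`, `sketch`, `code`) are hypothetical and are not taken from the AOCG code, so they should be adapted to whatever 'AOCG_finetune.py' actually expects.

```python
import json
from pathlib import Path


def read_jsonl(path):
    """Yield one parsed JSON object per non-empty line of a JSONL file."""
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)


def merge_records(requirements, api_terms, sketches, codes):
    """Combine four line-aligned lists into one list of records.

    The keys "nl", "api", "sketch", and "code" are hypothetical names
    chosen for this illustration, not the fields used by AOCG itself.
    """
    if not (len(requirements) == len(api_terms) == len(sketches) == len(codes)):
        raise ValueError("inputs are not line-aligned")
    return [
        {"nl": nl, "api": api, "sketch": sketch, "code": code}
        for nl, api, sketch, code in zip(requirements, api_terms, sketches, codes)
    ]


def write_jsonl(records, path):
    """Write each record as one JSON object per line."""
    with Path(path).open("w", encoding="utf-8") as handle:
        for record in records:
            handle.write(json.dumps(record, ensure_ascii=False) + "\n")
```

A possible use would be to load 'api_terms.jsonl' and 'sketches.jsonl' with `list(read_jsonl(...))`, pass them to `merge_records` together with the requirements and reference programs, and write the result with `write_jsonl(..., "final_train.jsonl")`. The length check is there so that a misaligned extraction run fails loudly instead of silently pairing a sketch with the wrong program.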
Provider:
figshare
Created:
2023-12-07