LAIL

DataCite Commons | Updated 2024-07-30 | Indexed 2024-08-19
Download link:
https://figshare.com/articles/dataset/LAIL/22014596/1
Description:
<pre># LAIL

LAIL is a Large language model-Aware selection approach for In-context-Learning-based code generation. LAIL uses the LLMs themselves to select examples: it asks the LLM to label each candidate example as a positive or a negative example for a given requirement.

## Requirements
- openai
- tqdm
- java

We also provide a script (``/Evaluation/evaluation_setup.sh``) to help set up the programming language dependencies that are used in evaluation.
```bash
bash evaluation_setup.sh
```

## Dataset
The datasets include DevEval, MBJP, MBPP, MBCPP, and HumanEval.

DevEval is a repository-level code generation dataset collected from real-world code repositories. The dataset aligns with real-world code repositories in multiple dimensions, so we take DevEval as the example to demonstrate how to process a dataset.

Take `../Dataset/DevEval` as an example. `train.jsonl` and `test.jsonl` are built as follows:
(1) We randomly select two domains to evaluate LAIL and the baselines: the scientific engineering domain and the text processing domain.
(2) We randomly split the tasks of the two domains into a training set and a test set, obtaining 101 examples in the training set and 49 examples in the test set.
(3) Given a requirement from a repository, we use tree-sitter to parse the repository and acquire all of its functions.
(4) We treat the functions contained in the repository as the candidate pool. LAIL and the baselines then retrieve a few functions from the candidate pool as demonstration examples.

The `source data` and `test_source data` folders contain the original code repositories collected from GitHub.

The `estimate_prompt` folder contains the constructed prompts used to estimate candidate examples.

The `generation_prompt` folder contains the constructed prompts in which the demonstration examples are selected by LAIL and the different baselines.
For example:
(1) The `ICL_LAIL` folder provides the ids of the examples selected by LAIL in `LAIL_id`. Developers can directly use the provided prompts with `codellama_completion.py` to generate programs.
(2) After generating programs, developers need to process them with `process_generation.py`.
(3) Finally, developers evaluate the generated programs with the source code in the `Evaluation` folder.

## LAIL

### Estimate candidate examples by LLMs themselves

We leverage the LLMs themselves to estimate candidate examples. The code is stored in the `LAIL/estimate_examples` package.

Take `DevEval` as an example:
(1) The `/Dataset/DevEval/estimate_prompt` folder contains the constructed prompts used to estimate candidate examples.
(2) Developers run the following command to estimate candidate examples with CodeLlama-7B:
```bash
bash make_estimation_prompt.sh ../Dataset/DevEval/estimation_prompt
```
(3) According to the probability feedback of the LLMs, we acquire the positive and negative examples.

### Train a neural retriever

(1) We use the labeled positive and negative examples to train a neural retriever with contrastive learning.
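
To make the contrastive objective concrete, here is a minimal sketch of an InfoNCE-style loss for one requirement, one positive example, and several negative examples. This is an illustration under assumptions, not the repository's training code: the function names `cosine` and `info_nce_loss`, the use of cosine similarity, and the temperature value are all hypothetical, and the embeddings are assumed to come from whatever encoder the retriever uses.

```python
import math
from typing import Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two embedding vectors (hypothetical helper)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("embeddings must be non-zero vectors")
    return dot / (norm_u * norm_v)


def info_nce_loss(
    query: Sequence[float],
    positive: Sequence[float],
    negatives: Sequence[Sequence[float]],
    temperature: float = 0.1,  # assumed value, not taken from the source
) -> float:
    """InfoNCE-style loss: -log softmax of the positive among all candidates."""
    # The positive example always sits at index 0 of the logits.
    logits = [cosine(query, positive) / temperature]
    logits += [cosine(query, neg) / temperature for neg in negatives]
    # Log-sum-exp with the maximum subtracted for numerical stability.
    max_logit = max(logits)
    log_sum_exp = max_logit + math.log(sum(math.exp(x - max_logit) for x in logits))
    return log_sum_exp - logits[0]
```

The loss is close to zero when the requirement embedding is much nearer to the positive example than to every negative, and grows as a negative becomes the nearest candidate, which is the behavior a retriever trained on positive and negative labels needs.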
The retriever training code is stored in the `/LAIL/LAIL/retriever/train` folder.

```bash
export CUDA_VISIBLE_DEVICES=0
nohup python run.py \
 --output_dir=/saved_models \
 --model_type=roberta \
 --config_name=microsoft/graphcodebert-base \
 --model_name_or_path=microsoft/graphcodebert-base \
 --tokenizer_name=microsoft/graphcodebert-base \
 --do_train \
 --train_data_file=/id.jsonl \
 --epoch 100 \
 --block_size 128 \
 --train_batch_size 16 \
 --learning_rate 1e-4 \
 --max_grad_norm 1.0 \
 --seed 123456 >mbpp.txt 2>&1 &
```

### Select a few demonstration examples using the trained retriever

(2) Given a test requirement, developers use the trained retriever to select a few demonstration examples.
The code is stored in the `/LAIL/LAIL/retriever/train` folder.

```bash
bash run_inference.sh ../Dataset/DevEval
```

### Code Generation

(1) After acquiring the prompt context consisting of a few selected examples, developers input a test requirement and the prompt context into an LLM to obtain the desired programs.
For example, developers use CodeLlama (`../LAIL/ICL_LAIL/codellama_completion.py`) to generate programs:
```bash
export CUDA_VISIBLE_DEVICES=0
torchrun --nproc_per_node=1 --master_port=16665 codellama_completion.py Salesforce/CodeLlama-7b ../Dataset/DevEval/prompt_LAIL.jsonl --temperature=0.8 --max_batch_size=4 --output_base=output_random --get_logits=False
```

(2) After generating programs, developers need to process the generated programs with `../LAIL/ICL_LAIL/process_generation.py`.
```bash
python process_generation.py
```

### Baselines

This paper includes seven baselines that use different approaches to select demonstration examples for ICL-based code generation.

(1) The source code is in the `baselines` folder, and each baseline is in an individual folder.
Developers can acquire the selected examples of all baselines by running the source code as follows:
```bash
python baselines.py
```

(2) Then, developers use `/baselines/make_prompt.py` to construct a prompt context from the selected candidate examples as follows:
```bash
python make_prompt.py ICLCoder ICLCoder -1
```

### Evaluation
In this paper, we use Pass@k to evaluate the performance of LAIL and the baselines with the source code in `LAIL/Evaluation`.

Since DevEval is a repository-level code generation dataset that is complex to evaluate, developers can use the pipeline in `/LAIL/Evaluation/` to evaluate different approaches.

## Contact
If you have any questions or suggestions, please email us at `lijiaa@pku.edu.cn`.
</pre>
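
The README reports Pass@k but does not say how it is computed. As a hedged illustration, the sketch below uses the widely used unbiased estimator, 1 - C(n-c, k) / C(n, k), where n programs are sampled per task and c of them pass the tests. Whether the scripts in `LAIL/Evaluation` use exactly this estimator is an assumption, and the names `pass_at_k` and `mean_pass_at_k` are hypothetical.

```python
from math import comb
from typing import Iterable, Tuple


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased Pass@k estimate for one task.

    n: number of generated programs, c: number that pass the tests,
    k: sampling budget being evaluated (1 <= k <= n).
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    if n - c < k:
        # Fewer than k failing programs: every size-k sample contains a pass.
        return 1.0
    # Probability that a random size-k subset contains no passing program.
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results: Iterable[Tuple[int, int]], k: int) -> float:
    """Average Pass@k over tasks, given (n, c) pairs (hypothetical helper)."""
    scores = [pass_at_k(n, c, k) for n, c in results]
    if not scores:
        raise ValueError("results must contain at least one task")
    return sum(scores) / len(scores)
```

For instance, a task with 10 generated programs of which 3 pass gives a Pass@1 estimate of 0.3 under this formula; the dataset-level score is the mean over all test tasks.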
Provider:
figshare
Created:
2024-07-30