Simulink Mutation Testing using Large Language Models Item
Source: DataCite Commons | Updated: 2024-06-10 | Indexed: 2024-08-19
Download link:
https://figshare.com/articles/dataset/Simulink_Mutation_Testing_using_Large_Language_Models_Item/26004637/1
Description:
This repository contains the code and datasets used in the paper "Simulink Mutation Testing using Large Language Models".

## Abstract

We present BERTiMuS, an approach that uses a large language model, CodeBERT, to generate mutants for Simulink models. BERTiMuS converts Simulink models into textual representations, masks tokens from the derived text, and uses CodeBERT to predict the masked tokens. Simulink mutants are obtained by replacing the masked tokens with predictions from CodeBERT. We evaluate BERTiMuS using Simulink models from an industrial benchmark, and compare it with FIM, a state-of-the-art tool that relies on fault patterns to generate mutants. We show that, relying exclusively on CodeBERT, BERTiMuS can generate mutants that cover the Simulink mutation patterns documented in the literature for individual blocks. Our results indicate that: (a) BERTiMuS is complementary to FIM, and (b) when one considers a requirements-aware notion of mutation testing, BERTiMuS outperforms FIM.

## Folder Structure

```bash
├── pre-training
│   ├── src
│   │   ├── 0.pre-process.py          # Get XML files of all Simulink models.
│   │   ├── 1.parse                   # Extract the block information inside each XML file.
│   │   ├── 2.format_train_data       # Format the data to get the Simulink corpus for pre-training.
│   │   ├── 3.MLM_training            # Pre-train CodeBERT on the Simulink corpus to get a new model.
│   ├── data
│   │   ├── raw_data                  # 2,611 Simulink models
│   │   ├── processed_data            # XML files extracted from the Simulink models
│   │   ├── output                    # Block information extracted from the XML files
│   │   ├── mlm_corpus                # All data from output, collected into the pre-training corpus
├── experiments
│   ├── src
│   │   ├── 1.mask-mutate             # Use CodeBERT to mask tokens of the Simulink text and generate predictions as mutations
│   │   ├── 2.inject faluts           # Inject faults into Simulink models to get mutants
│   │   ├── 3.mutation testing        # Mutation testing on BERTiMuS and FIM mutants
│   │   ├── 4.visualization           # Visualization code to draw pie charts
│   ├── data
│   │   ├── mask-mutate
│   │   │   ├── mutating_simu_models      # XML files of the Simulink models to be mutated
│   │   │   ├── processed_mutation_data   # Textual format of the Simulink models to be mutated
│   │   │   ├── mutation_output           # Predicted mutation results
│   │   ├── saved_model               # Checkpoint of CodeBERT pre-trained on the Simulink corpus
│   ├── results
│   │   ├── subject models            # The five original Simulink models used for mutation
│   │   ├── mutant models             # BERTiMuS and FIM mutants of the five models
│   │   ├── mutation predictions      # Information on each mutant: block ID, original property value, mutated value
│   │   ├── mutation testing results  # Mutation testing results for FIM and BERTiMuS mutants
└── supplementary_materials
```

- `pre-training`: Code and data for pre-training CodeBERT on the Simulink corpus.
- `experiments`: Code for generating mutants, running mutation testing, and visualization.
- `supplementary_materials`: Additional information about our approach.

## All data and results

Due to size limits, all data, models, and outputs are available at [this link](https://drive.google.com/file/d/179SQPOguBYUbkRM76NkgXqVqBQp5-KKI/view).

## Getting Started

To reproduce the results in the paper, follow these steps:

1. Clone the repository.
2. Install the required dependencies by running `pip install -r requirements.txt`.
3. Run the scripts in the following order:
   1. In a Python environment, `cd pre-training/src`, then run `bash scripts/pretrain.sh`.
   2. In a Python environment, `cd experiments/src`, then run `bash scripts/mask-mutate.sh`.
   3. In MATLAB, `cd` to `experiments/src/2.inject faluts` and, for each model, execute `run m_update_block.m`.
   4. In MATLAB, `cd` to `experiments/src/3.mutation testing`, then `cd` into each model's folder and execute `run run_all.m`.

## Citation

If you use this code or the datasets in your research, please cite the following paper:

```
@inproceedings{paper,
  title = {Mutation Testing of Simulink Models using Large Language Models},
  author = {Anonymous},
  booktitle = {???},
  year = {2024},
}
```
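## Illustrative sketch of mask-and-predict mutation

The mask-and-predict idea described in the abstract can be illustrated with a small, self-contained sketch. This is not the repository's implementation: the whitespace tokenization, the textual block format, and the `toy_predict` function are all hypothetical stand-ins chosen so the example runs without a model download. In a real setup, the predictor would presumably be a masked language model such as CodeBERT.

```python
from typing import Callable, List, Tuple

MASK = "<mask>"  # assumed mask placeholder; real tokenizers define their own


def mask_each_token(tokens: List[str]) -> List[Tuple[int, List[str]]]:
    """Return one masked copy of the token list per position."""
    masked = []
    for i in range(len(tokens)):
        copy = list(tokens)
        copy[i] = MASK
        masked.append((i, copy))
    return masked


def generate_mutants(
    text: str, predict: Callable[[List[str]], List[str]]
) -> List[str]:
    """Mask each token in turn and build a mutant for every prediction
    that differs from the original token."""
    tokens = text.split()  # naive whitespace tokenization (assumption)
    mutants = []
    for i, masked in mask_each_token(tokens):
        for candidate in predict(masked):
            if candidate != tokens[i]:  # identical predictions are not mutants
                mutant = list(tokens)
                mutant[i] = candidate
                mutants.append(" ".join(mutant))
    return mutants


def toy_predict(masked_tokens: List[str]) -> List[str]:
    """Hypothetical stand-in for a masked language model.

    It only proposes candidates for the token right after "Operator".
    """
    i = masked_tokens.index(MASK)
    if i > 0 and masked_tokens[i - 1] == "Operator":
        return [">=", "<", "=="]
    return []


# Hypothetical textual form of a Simulink relational-operator block.
block_text = "RelationalOperator Operator >= Inputs 2"
mutants = generate_mutants(block_text, toy_predict)
for m in mutants:
    print(m)
```

With the toy predictor, only the operator token is mutated, and the prediction that matches the original (`>=`) is discarded, leaving two mutants.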
Provider:
figshare
Created:
2024-06-10



