Malicious Large Language Models Detection using Metadata Information
Source: NIAID Data Ecosystem, indexed 2026-05-02
Download link:
https://zenodo.org/record/12578530
Resource description:
# Introduction

This is the replication package for the paper "Malicious Large Language Models Detection using Metadata Information".
## Task Definition

Given the information of an LLM, the task is to identify whether it is a malicious LLM that may attack software systems. We treat the task as binary classification (0/1), where 1 stands for a malicious LLM and 0 for a benign (malicious-free) LLM.
## Dataset Statistics

### Data Format

Before preprocessing, each line in the uncompressed file holds the metadata of one large language model (LLM). The fields of one row are listed below.

- **idx:** the index of the example
- **repo_id:** the id of the LLM (e.g., microsoft/codebert-base)
- **tags:** the tags of the LLM
- **pipeline_tag:** the pipeline_tag of the LLM
- **downloads:** the number of downloads
- **created_time:** the creation time of the LLM
- **modelCard:** the text content of the LLM's model card
- **num_discussion:** the number of discussions
- **discussion:** the discussions of the LLM
- **para_size:** the size of the LLM
- **tensor_type:** the tensor type of the LLM
- **num_commit:** the number of commits
- **commit:** the commits of the LLM
After preprocessing, you obtain three .csv files (train.csv, valid.csv, test.csv) with the following columns:

- **idx:** the index of the example.
- **repo_id:** the id of the LLM (e.g., microsoft/codebert-base).
- **tags:** the tags of the LLM.
- **pipeline_tags:** the pipeline_tag of the LLM.
- **created_time:** the creation time of the LLM.
- **model_size:** the size of the LLM.
- **Tensor_type:** the tensor type of the LLM.
- **is_model_card:** whether the model has a model card. The value is 1 if it does, otherwise 0.
- **malicious_model_card:** whether the model card contains keywords describing a malicious model. The value is 1 if it does, otherwise 0.
- **repository_link:** whether the model card includes a repository link (GitHub, arXiv, homepage, bugs, or issues link). The value is 1 if it does, otherwise 0.
- **dataset_info:** whether the model card includes information on the adopted dataset. The value is 1 if it does, otherwise 0.
- **metrics_info:** whether the model card includes information on the evaluation metrics. The value is 1 if it does, otherwise 0.
- **script_info:** whether the model includes script information. The value is 1 if it does, otherwise 0.
- **config_content:** the content of the configuration script file (string).
- **stakeholder_name:** the names of authors, contributors, and maintainers (string).
- **number_discussion:** the number of discussions.
- **num_pr:** the number of pull requests.
- **malicious_discussion:** whether the discussions contain malicious behavior keywords. The value is 1 if they do, otherwise 0.
- **number_commit:** the number of commits.
- **malicious_commit:** whether the titles and messages of commits contain malicious behavior keywords. The value is 1 if they do, otherwise 0.
- **z_download:** the z-score of the number of downloads.
- **z_like:** the z-score of the number of likes.
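As a rough illustration of how a few of the derived columns could be computed from the raw fields, here is a minimal sketch using only the Python standard library. The keyword list, the helper names, and the choice of the population standard deviation for the z-score are assumptions made for this example; the package's `feature_generation.py` may do this differently.

```python
from statistics import mean, pstdev

# Hypothetical keyword list, for illustration only; the keywords actually
# used by the replication package are not listed in this README.
MALICIOUS_KEYWORDS = ("backdoor", "malware", "jailbreak")


def contains_keyword(text, keywords=MALICIOUS_KEYWORDS):
    """Return 1 if any keyword occurs in text (case-insensitive), else 0."""
    lowered = (text or "").lower()
    return int(any(keyword in lowered for keyword in keywords))


def z_scores(values):
    """Standardize values as (x - mean) / population standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All values identical: every z-score is defined as 0 here.
        return [0.0 for _ in values]
    return [(value - mu) / sigma for value in values]


def derive_features(rows):
    """Derive is_model_card, malicious_model_card and z_download per row."""
    z_download = z_scores([float(row["downloads"]) for row in rows])
    features = []
    for row, z in zip(rows, z_download):
        card = row.get("modelCard") or ""
        features.append({
            "repo_id": row["repo_id"],
            "is_model_card": int(bool(card.strip())),
            "malicious_model_card": contains_keyword(card),
            "z_download": z,
        })
    return features


# Toy rows using the raw field names from the data format above.
raw_rows = [
    {"repo_id": "a/m1", "modelCard": "A BERT model.", "downloads": 10},
    {"repo_id": "b/m2", "modelCard": None, "downloads": 20},
    {"repo_id": "c/m3", "modelCard": "Contains a backdoor trigger.", "downloads": 30},
]
features = derive_features(raw_rows)
for feature_row in features:
    print(feature_row)
```

The binary flags follow the 1/0 convention described in the column list; the z-score puts download counts on a common scale so that very popular repositories do not dominate.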
### Data Statistics

Data statistics of the dataset are shown in the table below:

| #File Names | #Examples |
| ------------------- | :------------------------: |
| dataset_feature.csv | 578,502 (560,257/18,245) |
| train_imbalance.csv | 462,801 (448,205/14,596) |
| valid_imbalance.csv | 57,849 (56,025/1,824) |
| test_imbalance.csv | 57,852 (56,027/1,825) |
| train_balance.csv | 29,192 (14,596/14,596) |
| valid_balance.csv | 3,648 (1,824/1,824) |
| test_balance.csv | 3,650 (1,825/1,825) |
| train_imbalance_50.csv | 289,252 (280,129/9,123) |
| train_imbalance_60.csv | 347,101 (336,154/10,947) |
| train_imbalance_70.csv | 404,952 (392,180/12,772) |
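The counts in the table can be cross-checked with a few lines of arithmetic. The sketch below assumes that each parenthesized pair is (benign/malicious), which the README does not state explicitly; the dictionary and function names are introduced only for this example.

```python
# (total, first, second) copied from the statistics table. Reading the pair
# as (benign, malicious) is an assumption, not something the README states.
SPLITS = {
    "dataset_feature.csv": (578502, 560257, 18245),
    "train_imbalance.csv": (462801, 448205, 14596),
    "valid_imbalance.csv": (57849, 56025, 1824),
    "test_imbalance.csv": (57852, 56027, 1825),
    "train_balance.csv": (29192, 14596, 14596),
    "valid_balance.csv": (3648, 1824, 1824),
    "test_balance.csv": (3650, 1825, 1825),
    "train_imbalance_50.csv": (289252, 280129, 9123),
    "train_imbalance_60.csv": (347101, 336154, 10947),
    "train_imbalance_70.csv": (404952, 392180, 12772),
}


def pair_matches_total(total, first, second):
    """True if the two per-class counts add up to the reported total."""
    return first + second == total


# Every row should be internally consistent.
row_ok = {name: pair_matches_total(*counts) for name, counts in SPLITS.items()}

# The three imbalanced splits should partition dataset_feature.csv.
parts = ["train_imbalance.csv", "valid_imbalance.csv", "test_imbalance.csv"]
parts_total = sum(SPLITS[name][0] for name in parts)

# Share of the second class in the full feature file.
total, _, second = SPLITS["dataset_feature.csv"]
minority_share = second / total

print(row_ok)
print(parts_total == total)
print(f"minority share: {minority_share:.2%}")
```

Under that reading, the malicious class makes up roughly 3% of the data, and the balanced files keep the malicious count of the matching imbalanced split while reducing the benign count to the same size. The `_50`, `_60`, and `_70` training files hold about 50%, 60%, and 70% of the 578,502 examples, which suggests (but does not confirm) that the suffix is the training proportion.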
Additional counts:

- 29646 models contain a GitHub link
- all_dataset.csv: 596383
- all_dataset_information.csv: 589140 (safe_dataset.csv: 570549, unsafe_dataset.csv: 18591)
- safe_dataset_information.csv: 559582, unsafe_dataset_information.csv: 18212
Feature groups:

- Description Feature: 'malicious_model_card', 'repository_link', 'dataset_info', 'metrics_info', 'config_content'
- Stakeholder Feature: 'stakeholder_name'
- Event Feature: 'num_pr', 'number_commit', 'malicious_commit'
- Context Feature: 'z_download', 'z_like'
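One way to work with these groups in code is a plain mapping from group name to column names, for example to keep only some groups of a row. This is a hypothetical helper, not part of the replication package; it uses `config_content`, the column name given in the data format, and the group keys are shortened names chosen for this sketch.

```python
# Feature groups as listed above; the dictionary keys are names chosen for
# this sketch, the column names follow the preprocessed CSV files.
FEATURE_GROUPS = {
    "description": ["malicious_model_card", "repository_link", "dataset_info",
                    "metrics_info", "config_content"],
    "stakeholder": ["stakeholder_name"],
    "event": ["num_pr", "number_commit", "malicious_commit"],
    "context": ["z_download", "z_like"],
}


def select_columns(row, groups):
    """Keep only the columns of `row` that belong to the requested groups."""
    wanted = [column for group in groups for column in FEATURE_GROUPS[group]]
    return {column: row[column] for column in wanted if column in row}


# Toy row with a subset of the preprocessed columns.
example_row = {
    "repo_id": "a/m1",
    "malicious_model_card": 0,
    "stakeholder_name": "alice",
    "num_pr": 2,
    "number_commit": 14,
    "malicious_commit": 0,
    "z_download": -0.3,
    "z_like": 0.1,
}
event_and_context = select_columns(example_row, ["event", "context"])
print(event_and_context)
```

Columns that belong to no group, such as `repo_id`, are dropped, and columns missing from a row are skipped rather than raising an error.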
## Pipeline-MPTMHunter

We also provide a pipeline that fine-tunes [MPTMHunter](https://doi.org/10.5281/zenodo.12578531) on this task.
### Experimental environment configuration

```bash
huggingface_hub 0.23.1
libxgboost 2.0.3
lightgbm 4.3.0
networkx 3.2.1
nltk 3.8.1
numpy 1.26.3
openssl 3.0.13
pandas 2.2.1
pillow 10.2.0
scikit-learn 1.4.2
scipy 1.13.0
torch 2.3.0+cu118
torchaudio 2.3.0+cu118
torchvision 0.18.0+cu118
tqdm 4.66.2
transformers 4.37.2
xgboost 2.0.3
```
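### Dataset Collection Script

```bash
# DataExtraction.ipynb is a Jupyter notebook, so execute it with Jupyter
# rather than passing it to the Python interpreter.
jupyter nbconvert --to notebook --execute ./script/DataExtraction.ipynb
python ./script/dataset_spider.py
python ./script/config_crawl.py
```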
### Dataset Preprocess Script

```bash
python feature_generation.py \
    --input_file='../dataset/dataset_information.csv' \
    --output_file='../dataset/dataset_feature.csv'

python feature_generation.py \
    --input_file='../dataset/real_world_dataset_information_0701.csv' \
    --output_file='../dataset/real_world_dataset_feature_0701.csv'
```
### Model Training Script

```bash
python run_codet5_lstm.py \
    --output_dir='../saved_models/codet5_lstm_imbalance_final' \
    --model_type=codet5 \
    --tokenizer_name='../models/codet5' \
    --model_name_or_path='../models/codet5' \
    --do_train \
    --train_data_file='../dataset/train_imbalance_70.csv' \
    --eval_data_file='../dataset/valid_imbalance_70.csv' \
    --test_data_file='../dataset/test_imbalance.csv' \
    --epoch=3 \
    --block_size=510 \
    --train_batch_size=64 \
    --eval_batch_size=64 \
    --learning_rate=2e-5 \
    --max_grad_norm=1.0 \
    --evaluate_during_training \
    --seed=123456
```
### Model Inference Script

```bash
python run_codet5_lstm.py \
    --output_dir='../saved_models/codet5_lstm_imbalance' \
    --model_type=codet5 \
    --tokenizer_name='../models/codet5' \
    --model_name_or_path='../models/codet5' \
    --do_eval \
    --do_test \
    --train_data_file='../dataset/train_imbalance_70.csv' \
    --eval_data_file='../dataset/valid_imbalance_70.csv' \
    --test_data_file='../dataset/test_imbalance.csv' \
    --epoch=3 \
    --block_size=510 \
    --train_batch_size=64 \
    --eval_batch_size=64 \
    --learning_rate=2e-5 \
    --max_grad_norm=1.0 \
    --evaluate_during_training \
    --seed=123456
```
### Evaluation Script

```bash
python ../evaluation/evaluation.py -a ../dataset/test_balance.csv -p ../saved_models/codebert_imbalance_all/predictions.txt
python ../evaluation/evaluation.py -a ../dataset/test_imbalance.csv -p ../dataset/predictions.txt
```
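The metrics reported for this task (accuracy, precision, recall, F1-score) follow the standard binary-classification definitions with 1 (malicious) as the positive class. The sketch below computes them from two lists of 0/1 labels. It is not the package's `evaluation.py`, whose input file formats are not described here; the function name and the toy labels are assumptions for illustration.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 with label 1 as the positive class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    # Guard against division by zero when a class is never predicted or absent.
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels: three malicious and five benign models.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

On an imbalanced test set, accuracy alone says little because most models are benign, so recall and F1 on the malicious class are the more informative numbers.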
## Result

The results on the test set are shown below (we use OpenTextClassification as the baseline):
| Methods | ACC | Precision | Recall | F1-Score |
| ------------- | :----: | :-------: | :----: | :------: |
| Random Forest | 97.18% | 83.80% | 13.04% | 22.57% |
| LR | 96.80% | 43.46% | 4.55% | 8.23% |
| LightGBM | 97.62% | 95.18% | 25.97% | 40.81% |
| TextRNN | 96.86% | 73.33% | 0.60% | 1.20% |
| TextCNN | 98.89% | 95.61% | 68.00% | 79.47% |
| TextRCNN | 98.87% | 95.36% | 67.62% | 79.13% |
| TextRNN_Att | 98.94% | 95.29% | 69.86% | 80.62% |
| MPTMHunter | **99.99%** | **99.95%** | **99.78%** | **99.86%** |
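To put the accuracy column in context, it helps to compare it with a trivial classifier that labels every model as benign. The sketch below assumes that the results were measured on `test_imbalance.csv` and that its counts 56027/1825 are benign/malicious; neither is stated explicitly in this README, and the names are introduced only for this example.

```python
# Counts for test_imbalance.csv from the statistics table; reading them as
# (benign, malicious) and as the evaluation set is an assumption.
BENIGN, MALICIOUS = 56027, 1825


def majority_baseline_accuracy(n_majority, n_minority):
    """Accuracy of a classifier that always predicts the majority class."""
    return n_majority / (n_majority + n_minority)


baseline = majority_baseline_accuracy(BENIGN, MALICIOUS)

# Accuracies copied from the results table, as fractions.
reported_acc = {
    "Random Forest": 0.9718,
    "LR": 0.9680,
    "LightGBM": 0.9762,
    "TextRNN": 0.9686,
    "TextCNN": 0.9889,
    "TextRCNN": 0.9887,
    "TextRNN_Att": 0.9894,
    "MPTMHunter": 0.9999,
}

print(f"always-benign baseline: {baseline:.4f}")
for name, acc in reported_acc.items():
    print(f"{name}: {acc - baseline:+.4f} relative to the baseline")
```

Under these assumptions the always-benign baseline already reaches about 96.8% accuracy, which is why methods with very low recall in the table can still show accuracy near 97%.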
Created: 2024-09-16



