
Supporting_Information_Files.

Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://figshare.com/articles/dataset/Supporting_Information_Files_/30056808
Description:
Malware classification is a challenging task because malicious software evolves constantly. Traditional signature-based methods and static analysis often fail to detect sophisticated threats, which makes behavior-based analysis crucial. This study proposes a malware detection model that analyzes the behavior of executable (.exe) files to identify malware. The model submits each file to VirusTotal, where it runs in a secure environment that monitors actions such as file modifications, registry changes, and network connections. To improve detection accuracy, a BERT model is applied to extract key features from these behavior logs. After 100 training epochs, the model achieved 92.25% accuracy and an F1-score of 91.22%. A class-wise evaluation was also conducted, treating each malware family as a distinct class to assess per-family detection performance. In addition, a correlation matrix was analyzed to explore inter-class relationships and identify overlapping behaviors. Experimental results show that SVM achieved the highest F1-scores for Adware (0.98) and BackDoor (0.91), while Random Forest showed comparable performance. Naïve Bayes, however, performed poorly for FakeAlert (F1-score: 0.64). These findings support the effectiveness of the proposed behavior-based approach using BERT features, with SVM and Random Forest proving to be the most reliable classifiers.
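The class-wise evaluation described above can be illustrated with a minimal sketch. It is not the authors' code: the function name `per_class_f1` and the toy labels are invented for illustration, and only the family names (Adware, BackDoor, FakeAlert) come from the description. It computes a one-vs-rest F1-score per family as 2·TP / (2·TP + FP + FN).

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Return {label: F1}, treating each label as its own one-vs-rest class."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this family
        else:
            fp[pred] += 1    # predicted family gains a false positive
            fn[truth] += 1   # true family gains a false negative
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


# Toy labels, invented for illustration only (not taken from the dataset).
y_true = ["Adware", "Adware", "BackDoor", "FakeAlert", "FakeAlert", "BackDoor"]
y_pred = ["Adware", "Adware", "BackDoor", "BackDoor", "FakeAlert", "BackDoor"]

scores = per_class_f1(y_true, y_pred)
for family, f1 in scores.items():
    print(f"{family}: F1 = {f1:.2f}")
```

Reporting one score per family, rather than a single aggregate, is what makes a weak class such as the FakeAlert result under Naïve Bayes visible.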
Created:
2025-09-04