RansomSet - A Dataset for Ransomware Detection & Analysis
Source: IEEE. Indexed: 2026-04-17
Download link:
https://ieee-dataport.org/documents/ransomset-dataset-ransomware-detection-analysis
Description:
Ransomware poses a significant threat to individuals, corporations, and governments by encrypting or locking critical data and demanding payment for its recovery. While machine learning-based solutions show promise for ransomware detection, their effectiveness is limited by the lack of robust, publicly available datasets reflecting modern threats. In this work, we introduce RansomSet, a new multiclass ransomware dataset designed to facilitate the development and benchmarking of such models. RansomSet includes detailed behavioral data, specifically system call sequences, captured from six prominent ransomware families (Conti, Ryuk, CryptoLocker, LockBit, Sodinokibi, WannaCry) alongside benign software samples, resulting in a set of 240 features. Data was generated through repeated dynamic analysis (30 executions per sample) using the Cuckoo sandbox. We employed Exploratory Data Analysis (EDA) to characterize the dataset's properties, including inherent class imbalances reflecting ransomware operational intensity, and eXplainable Artificial Intelligence (XAI) techniques, specifically SHapley Additive exPlanations (SHAP), to provide interpretability. Experimental evaluation using XGBoost demonstrates high classification performance (F1-Score 0.99). Our findings offer interpretable insights into distinct system call patterns, identifying Sodinokibi as the most challenging family to detect. RansomSet is publicly available and serves as a useful resource for advancing research in behavior-based ransomware detection.
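The description reports a single F1-score and names one family as the hardest to detect, but it does not say how the score is averaged across classes. As a minimal sketch of how per-class F1 can expose the weakest class in an imbalanced multiclass setting, the following uses only the Python standard library. The function names and the toy labels are hypothetical and invented for illustration; they are not drawn from RansomSet and do not reproduce the authors' evaluation.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Return {label: F1}, scoring each label one-vs-rest."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted label gains a false positive
            fn[truth] += 1  # true label gains a false negative
    scores = {}
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        # F1 = 2TP / (2TP + FP + FN); defined as 0.0 when the label never occurs
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(scores):
    """Unweighted mean of per-class F1, so small classes count equally."""
    return sum(scores.values()) / len(scores)


def hardest_class(scores):
    """Label with the lowest F1."""
    return min(scores, key=scores.get)


# Toy labels for illustration only (assumption: not real RansomSet data).
y_true = ["Benign", "Benign", "WannaCry", "WannaCry", "Sodinokibi", "Sodinokibi"]
y_pred = ["Benign", "Benign", "WannaCry", "WannaCry", "Sodinokibi", "Benign"]

scores = per_class_f1(y_true, y_pred)
print(scores)
print("macro F1:", round(macro_f1(scores), 4))
print("hardest class:", hardest_class(scores))
```

Reporting per-class scores alongside an aggregate is one way to keep a dominant class from masking weaker performance on a rarer family.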
Provided by:
Rodrigo Miani; Silvio Quincozes; Gabriel Oliveira; Juliano Kazienko; Gabriel Foletto; Vagner Quincozes



