Data for the Paper "Scalable and Generalizable RL Agents for Attack Path Discovery via Continuous Invariant Spaces"

NIAID Data Ecosystem, indexed 2026-05-02

Download link:

https://zenodo.org/record/14604651

Description:

This repository contains the data used in the paper "Scalable and Generalizable RL Agents for Attack Path Discovery via Continuous Invariant Spaces". The data were generated using the C-CyberBattleSim framework, an extension of the original Microsoft CyberBattleSim framework. They include the data scraped for generating scenarios, the generated scenarios, the training, testing, and hyperparameter optimization results of the GAE, the agent models, and the multi-label classifier used to label the vulnerabilities.

The data is organized as follows:

- environment_database/: Contains the data scraped from NVD and Shodan on the services and vulnerabilities used in the simulations.
- scenarios/: Contains the scenarios generated using the environment database.
- classifiers_data/: Contains the labeled vulnerabilities used for training the multi-label classifier.
- gae_hyperopt/: Contains the results of the hyperparameter optimization for the GAE model.
- gae_training/: Contains the results of the training and testing of the GAE model.
- classifiers_hyperopt/: Contains the results of the hyperparameter optimization for the multi-label classifier model.
- classifiers_training_testing/: Contains the results of the training and testing of the multi-label classifier model.
- agent_hyperopt/: Contains the results of the hyperparameter optimization for the agent models.
- agents_training_testing/: Contains the results of the training and testing of the agent models.
- config/: Contains the configuration files used in the study, divided into subfolders according to the sub-study in which they were used.

The subfolders within each folder have been renamed to be self-explanatory. For any questions regarding reproducibility, please feel free to contact us. The C-CyberBattleSim tool includes a README file with detailed commands for using this data effectively.
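Before running the commands from the C-CyberBattleSim README, it can help to confirm that a local copy of the data has the folder layout described above. The following is a minimal sketch, not part of the repository or the tool: the function name `missing_folders` is hypothetical, and it assumes the archive has been extracted so that the ten folders sit directly under a single root directory, which may not match how the Zenodo record packages its files.

```python
from pathlib import Path

# Top-level folder names taken from the repository description.
EXPECTED_FOLDERS = [
    "environment_database",
    "scenarios",
    "classifiers_data",
    "gae_hyperopt",
    "gae_training",
    "classifiers_hyperopt",
    "classifiers_training_testing",
    "agent_hyperopt",
    "agents_training_testing",
    "config",
]


def missing_folders(root):
    """Return the expected folder names that are not directories under root.

    Assumption: all folders were extracted side by side under `root`.
    """
    root = Path(root)
    return [name for name in EXPECTED_FOLDERS if not (root / name).is_dir()]
```

Calling `missing_folders` with the path of the extracted data returns an empty list when all ten folders are present, and otherwise lists the names that still need to be downloaded or unpacked.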

Created:

2025-01-06