tumeteor/Security-TTP-Mapping
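Dataset Overview
Basic Information
- License: cc
- Task categories:
  - Text classification
  - Question answering
  - Zero-shot classification
  - Sentence similarity
- Language: English
- Tags:
  - Security
  - TTP mapping
  - MITRE ATT&CK
  - Extreme multi-label
  - Multi-label classification
- Name: Security Attack Pattern Recognition Datasets
- Size category: 1K<n<10K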
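Dataset Description
- Task: security attack pattern (TTP) recognition, also called TTP mapping
- Content: textual information about malware and other security topics
- Type: multi-label classification
- Number of classes: more than 600 hierarchical classes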
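
To make the "multi-label, hierarchical" setting concrete, the following is a minimal sketch of how a text's labels could be encoded as a multi-hot vector while respecting the hierarchy. It assumes that labels are MITRE ATT&CK technique IDs, where a sub-technique such as `T1059.001` has the parent technique `T1059`. The toy label space and the helper names `expand_with_parents` and `multi_hot` are hypothetical and are not part of the dataset.

```python
from typing import Dict, Iterable, List, Set


def expand_with_parents(labels: Iterable[str]) -> Set[str]:
    """Add the parent technique for every sub-technique ID (T1059.001 -> T1059)."""
    expanded: Set[str] = set()
    for label in labels:
        expanded.add(label)
        if "." in label:
            # ATT&CK sub-technique IDs have the form <parent>.<suffix>.
            expanded.add(label.split(".", 1)[0])
    return expanded


def multi_hot(labels: Iterable[str], label_index: Dict[str, int]) -> List[int]:
    """Encode a label set as a 0/1 vector over the full label space."""
    vector = [0] * len(label_index)
    for label in expand_with_parents(labels):
        vector[label_index[label]] = 1
    return vector


# Hypothetical toy label space; the real one has more than 600 classes.
LABEL_SPACE = ["T1027", "T1059", "T1059.001", "T1566", "T1566.001"]
LABEL_INDEX = {label: i for i, label in enumerate(LABEL_SPACE)}

# A text annotated with one sub-technique and one technique.
vector = multi_hot(["T1059.001", "T1027"], LABEL_INDEX)
print(vector)
```

Whether parent techniques should be switched on automatically is a modelling choice; the sketch does so only to illustrate the hierarchy.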
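Dataset Details
TRAM
- Source: CTID
- Processing: collected from all sources, deduplicated, cleaned of noisy or too-short texts and noisy labels, and remapped to MITRE ATT&CK v12.0
- Splits: training, validation, and test sets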
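
The cleaning steps listed for TRAM (deduplication and removal of too-short texts) can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' actual pipeline: the token threshold `MIN_TOKENS`, the whitespace and case normalisation, and the choice to merge the labels of duplicate texts are all hypothetical.

```python
import re
from typing import Dict, List, Set, Tuple

MIN_TOKENS = 5  # hypothetical cut-off for "too short" texts

Sample = Tuple[str, Set[str]]


def normalize(text: str) -> str:
    """Collapse whitespace and lowercase so near-identical copies compare equal."""
    return re.sub(r"\s+", " ", text).strip().lower()


def clean_corpus(samples: List[Sample], min_tokens: int = MIN_TOKENS) -> List[Sample]:
    """Drop too-short texts and merge duplicates, keeping the union of their labels."""
    merged: Dict[str, Sample] = {}
    for text, labels in samples:
        key = normalize(text)
        if len(key.split()) < min_tokens:
            continue  # too short to carry a usable attack-pattern description
        if key in merged:
            merged[key][1].update(labels)  # assumption: duplicates pool their labels
        else:
            merged[key] = (text.strip(), set(labels))
    return list(merged.values())


raw = [
    ("The malware uses PowerShell to download a second-stage payload.", {"T1059.001"}),
    ("the malware uses  PowerShell to download a second-stage payload.", {"T1105"}),
    ("Uses cmd.", {"T1059.003"}),
]
cleaned = clean_corpus(raw)
print(len(cleaned), sorted(cleaned[0][1]))
```

In this toy input the two near-identical sentences collapse into one sample carrying both labels, and the two-word text is discarded.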
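Procedure+
- Sub-datasets:
  - Procedures: from MITRE; all procedure examples of v12.0 were collected and processed, then split into training, validation, and test sets
  - Derived procedures: URL references were crawled and the original text was extracted from the related articles, then processed and split into training, validation, and test sets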
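Expert
- Construction: built from a large collection of high-quality threat reports
- Annotation: carefully selected and annotated by senior security experts
- Splits: pre-split into training, validation, and test sets; each text in the test set has about 4 labels on average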
Citation
@inproceedings{nguyen-srndic-neth-ttpm,
    title = "Noise Contrastive Estimation-based Matching Framework for Low-resource Security Attack Pattern Recognition",
    author = "Nguyen, Tu and Šrndić, Nedim and Neth, Alexander",
    booktitle = "Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics",
    month = mar,
    year = "2024",
    publisher = "Association for Computational Linguistics",
    abstract = "Tactics, Techniques and Procedures (TTPs) represent sophisticated attack patterns in the cybersecurity domain, described encyclopedically in textual knowledge bases. Identifying TTPs in cybersecurity writing, often called TTP mapping, is an important and challenging task. Conventional learning approaches often target the problem in the classical multi-class or multilabel classification setting. This setting hinders the learning ability of the model due to a large number of classes (i.e., TTPs), the inevitable skewness of the label distribution and the complex hierarchical structure of the label space. We formulate the problem in a different learning paradigm, where the assignment of a text to a TTP label is decided by the direct semantic similarity between the two, thus reducing the complexity of competing solely over the large labeling space. To that end, we propose a neural matching architecture with an effective sampling-based learn-to-compare mechanism, facilitating the learning process of the matching model despite constrained resources.",
}
License
This project is licensed under the Creative Commons CC BY License, version 4.0.




