
CISA TTP Articles Data Set

Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/record/14659511
Description:
This dataset contains 77 cybersecurity articles crawled from the public CISA website. All of these articles were publicly available at the time of crawling, without a subscription or paid service. The articles were published between July 2020 and February 2024 and were selected for this dataset if they explicitly mentioned MITRE ATT&CK TTPs (Tactics, Techniques, and Procedures).

The dataset supports research in Cyber Threat Intelligence, where it may serve as ground truth for TTP labeling. Specifically, it is designed to facilitate research and analysis on identifying and classifying TTPs in cybersecurity advisories.

Each crawled article is represented by the following four columns:

RawText: The unfiltered text extracted from the main content of each article (class: "l-full__main").
TTP: A set of MITRE ATT&CK TTP (Tactics, Techniques, and Procedures) IDs identified within the article's RawText. These IDs are extracted using the regex pattern: (?:TA\d{4}|T\d{4,5}(?:\.\d{3})?).
CleanText: A cleaned version of the RawText, with tables and TTP IDs removed for clarity.
URL: The URL of the original article.

About the crawling process

All advisories were gathered on September 27, 2024 from the CISA website by working through all advisory URLs backwards in time until 2020. All articles that explicitly mentioned TTPs were selected for the dataset. To detect the presence of TTP IDs, the main content of each article was checked for any of the following phrases:

"MITRE ATT&CK Tactics and Techniques"
"Tactics and Techniques"
"MITRE ATT&CK Techniques"

The dataset is available in CSV and JSON formats, both containing the same data.
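The extraction and selection steps described above can be illustrated with a minimal Python sketch. Only the regex pattern and the three phrases are taken from the description; the function names (`extract_ttps`, `mentions_ttp_section`, `strip_ttp_ids`) and the sample sentence are hypothetical, and this is not the authors' actual crawling or cleaning code. In particular, the table removal used for CleanText is not reproduced here.

```python
import re

# Regex pattern quoted in the dataset description:
# tactic IDs (TA0001) or technique IDs with an optional sub-technique (T1059.001).
TTP_PATTERN = re.compile(r"(?:TA\d{4}|T\d{4,5}(?:\.\d{3})?)")

# Phrases quoted in the dataset description as selection criteria.
SECTION_PHRASES = (
    "MITRE ATT&CK Tactics and Techniques",
    "Tactics and Techniques",
    "MITRE ATT&CK Techniques",
)


def extract_ttps(raw_text: str) -> set:
    """Return the set of TTP IDs found in the text (hypothetical helper)."""
    return set(TTP_PATTERN.findall(raw_text))


def mentions_ttp_section(raw_text: str) -> bool:
    """Check whether any of the selection phrases occurs in the text."""
    return any(phrase in raw_text for phrase in SECTION_PHRASES)


def strip_ttp_ids(raw_text: str) -> str:
    """Remove TTP IDs only; an assumption about one part of the cleaning step."""
    return TTP_PATTERN.sub("", raw_text)


# Made-up sample sentence, not taken from the dataset.
sample = (
    "MITRE ATT&CK Techniques: the actors used Phishing [T1566] and "
    "PowerShell [T1059.001] to gain Initial Access [TA0001]."
)

ttps = extract_ttps(sample)
selected = mentions_ttp_section(sample)
cleaned = strip_ttp_ids(sample)
```

Note that the pattern has no word boundaries, so it would also match an ID embedded in a longer token; whether the published TTP column was post-filtered is not stated in the description.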
Created:
2025-01-16