PhishStorm - phishing / legitimate URL dataset
Source: DataCite Commons | Updated: 2025-11-14 | Indexed: 2024-07-03
Download link:
https://research.aalto.fi/en/datasets/f49465b2-c68a-4182-9171-075f0ed797d5
Description:
URL dataset, with computed features, built and used for evaluation in the paper "PhishStorm: Detecting Phishing with Streaming Analytics" published in IEEE TNSM.
The dataset contains 96,018 URLs: 48,009 legitimate URLs and 48,009 phishing URLs.
This is a CSV file in which the "domain" column provides a unique identifier for each entry (which is actually a URL). The "label" column gives each entry's status: 0 for legitimate, 1 for phishing.
Other columns provide computed values for features introduced in [1].
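As a rough illustration of the layout described above, the following sketch reads a CSV of this shape and counts entries per class. Only the "domain" and "label" column names come from the description; the sample rows, the feature column names (feature_a, feature_b) and the helper name summarize are made up for illustration, and the label is parsed leniently on the assumption that it may be stored as either 0/1 or 0.0/1.0.

```python
import csv
import io
from collections import Counter

# Label encoding as stated in the dataset description.
LABEL_NAMES = {0: "legitimate", 1: "phishing"}


def summarize(csv_file):
    """Count entries per class and list the feature columns.

    Hypothetical helper: every column other than "domain" and "label"
    is treated as a computed feature.
    """
    reader = csv.DictReader(csv_file)
    feature_columns = [c for c in reader.fieldnames if c not in ("domain", "label")]
    counts = Counter()
    for row in reader:
        # Accept "0"/"1" as well as "0.0"/"1.0" (assumption about the file).
        label = int(float(row["label"]))
        counts[LABEL_NAMES[label]] += 1
    return counts, feature_columns


# Made-up sample rows; the real file has 96,018 entries and its own
# feature columns, which are not listed in the description.
sample = io.StringIO(
    "domain,feature_a,feature_b,label\n"
    "example.com/index.html,0.1,3,0\n"
    "example.org/login/verify,0.9,7,1\n"
    "example.net/about,0.2,1,0\n"
)

counts, features = summarize(sample)
print(dict(counts))
print(features)
```

To run this against the downloaded file, pass an opened file object instead of the in-memory sample, for example `open(path, newline="")` with `path` pointing at the local copy. On the full dataset both classes should come out at 48,009 if the file matches the description.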
Please refer to the following publication when citing this dataset:
[1] S. Marchal, J. Francois, R. State, and T. Engel. PhishStorm: Detecting Phishing with Streaming Analytics. IEEE Transactions on Network and Service Management (TNSM), 11(4):458-471, 2014.
Provider:
Aalto University
Created:
2021-10-27
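Background overview
The dataset is a CSV file of phishing and legitimate URLs, 96,018 in total, split evenly between the two classes. It provides the domain, the label, and several computed feature values for use in phishing detection research.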



