Phishing Website Dataset
帕依提提, indexed 2024-03-04
Download link:
https://www.payititi.com/opendatasets/show-13760.html
Resource description:
ISSR CS602 Machine Learning project: Website Phishing dataset
Summary:
| Property | Value |
|-|-|
| Dataset feature | Multivariate |
| Number of instances | 1353 |
| Attribute type | Integer |
| Number of attributes | 10 |
| Related task | Classification |
| Number of web clicks | 54880 |
Source: [Dataset URL](https://archive.ics.uci.edu/ml/datasets/Website+Phishing) (collected by Neda Abdelhamid, Auckland Institute of Studies; email: nedah@ais.ac.nz)
Dataset Information: Phishing is regarded as a critical problem for online industries, especially e-banking and e-commerce, given the number of online transactions that involve payments. We identified features associated with legitimate and phishing websites and collected 1,353 websites from different sources. Phishing websites were collected from the PhishTank data archive (www.phishtank.com), a free community site where users can submit, verify, track, and share phishing data. Legitimate websites were collected from the Yahoo and Startpoint directories using a web script developed in PHP, which was plugged into a browser. Of the 1,353 websites, 548 are legitimate, 702 are phishing URLs, and 103 are suspicious URLs. A website is considered suspicious when it shows some legitimate and some phishing features, so it could be either phishing or legitimate.
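As a quick consistency check on the figures quoted above, the following minimal Python sketch confirms that the three class counts add up to the stated number of instances and computes each class's share. The dictionary keys are illustrative names chosen here, not identifiers from the dataset files.

```python
# Class counts as stated in the dataset description.
# The key names are illustrative, not taken from the dataset files.
counts = {"legitimate": 548, "phishing": 702, "suspicious": 103}

total = sum(counts.values())
assert total == 1353  # matches the stated number of instances

# Share of each class in percent, rounded to one decimal place.
shares = {name: round(100 * n / total, 1) for name, n in counts.items()}
print(shares)
```

The classes are noticeably imbalanced, with the suspicious class making up well under a tenth of the instances, which is worth keeping in mind when choosing an evaluation metric for a classifier.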
Attribute Information: The listed attributes are URL Anchor, Request URL, SFH (server form handler), URL Length, Having "@", Prefix/Suffix, IP, Sub Domain, Web Traffic, Domain Age, and Class. The collected features hold the categorical values "Legitimate", "Suspicious", and "Phishy", which have been replaced with the numerical values 1, 0, and -1 respectively. Details of each feature are given in the research paper referenced by the dataset source.
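The numeric encoding described above (1, 0, -1) can be mapped back to readable names when inspecting rows. The sketch below is only an illustration: the three-row sample, its column names, and the `decode_row` helper are hypothetical and are not taken from the actual dataset files.

```python
import csv
import io

# Encoding stated in the description: 1 = legitimate, 0 = suspicious, -1 = phishing.
VALUE_NAMES = {1: "legitimate", 0: "suspicious", -1: "phishing"}

# Hypothetical sample: column names and values are made up for illustration
# and do not come from the real dataset file.
SAMPLE_CSV = """SFH,URL_Length,Class
1,0,1
-1,1,-1
0,-1,0
"""


def decode_row(row):
    """Map every encoded integer in a row back to its categorical name."""
    return {column: VALUE_NAMES[int(value)] for column, value in row.items()}


decoded = [decode_row(row) for row in csv.DictReader(io.StringIO(SAMPLE_CSV))]
for row in decoded:
    print(row)
```

A value outside the three expected codes raises a `KeyError`, which acts as a cheap sanity check when reading unfamiliar data.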
Provider:
帕依提提
Collected summary
Dataset introduction

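Background and challenges
Background overview
This is a machine learning dataset for phishing website classification. It contains 1353 website instances with 10 integer-valued attributes. The phishing websites come from Phishtank.com and the legitimate websites from directories such as Yahoo. In total there are 548 legitimate websites, 702 phishing URLs, and 103 suspicious URLs. The dataset is suited to classification tasks.
The above content was collected and summarized by 遇见数据集.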



