RevaHQ/data-phishing-detection
Source: Hugging Face. Updated 2024-10-24; indexed 2025-11-15.
Download link:
https://hf-mirror.com/datasets/RevaHQ/data-phishing-detection
Description:
---
language:
- en
---
# data-phishing-detection
A dataset for testing methods that detect phishing emails.
The file `data.parquet` contains the dataset: 400 emails, of which 200 are synthetic phishing attempts and 200 are synthetic regular emails.
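As a quick sanity check, the stated 200/200 split can be verified after loading the file. The sketch below assumes pandas and a parquet engine such as pyarrow are installed, and that the label column is named `output` as described under Schema; the helper names `load_rows`, `label_counts`, and `is_balanced` are illustrative and not part of the dataset.

```python
from collections import Counter


def load_rows(path="data.parquet"):
    """Read the parquet file into a list of row dicts.

    Assumes pandas and a parquet engine (e.g. pyarrow) are installed.
    """
    import pandas as pd  # lazy import: only needed when reading the file

    return pd.read_parquet(path).to_dict(orient="records")


def label_counts(rows):
    """Count rows per value of the `output` column."""
    return Counter(row["output"] for row in rows)


def is_balanced(rows, per_class=200):
    """Check the stated split: `per_class` 'Yes' rows and `per_class` 'No' rows."""
    return label_counts(rows) == Counter({"Yes": per_class, "No": per_class})
```

With the file downloaded locally, `is_balanced(load_rows("data.parquet"))` should return `True` if the file matches the description above.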
## Schema
- `input`: an email, synthesized by an LLM, that is either a phishing attempt or a regular email.
- `output`: `'Yes'` if the email is a phishing attempt, `'No'` otherwise.
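Given this two-column schema, a detector can be scored by comparing its verdict on each `input` against the `output` label. The following is a minimal sketch, not an official evaluation script: `is_phishing` and `accuracy` are hypothetical helper names, and `predict` stands for any function that takes an email string and returns `True` for phishing.

```python
def is_phishing(label):
    """Map an `output` value to a bool; reject anything other than 'Yes' or 'No'."""
    if label == "Yes":
        return True
    if label == "No":
        return False
    raise ValueError(f"unexpected label: {label!r}")


def accuracy(rows, predict):
    """Fraction of rows where predict(row['input']) agrees with the label."""
    if not rows:
        raise ValueError("no rows to score")
    hits = sum(predict(row["input"]) == is_phishing(row["output"]) for row in rows)
    return hits / len(rows)
```

Because the two classes are the same size, a detector that always answers the same way scores 0.5, which gives a simple baseline for comparison.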
## Prompt
The `prompt.md` file contains a prompt that can be used with an LLM as a starting point for detection. Append an example email to the prompt for the LLM to classify.
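One way to do the appending is sketched below. The contents of `prompt.md` are not reproduced here, so the blank-line separator and the expectation that the model replies with `Yes` or `No` are assumptions; `build_prompt` and `parse_answer` are hypothetical helpers, and the call to the LLM itself is left out.

```python
def build_prompt(prompt_text, email):
    """Append one email to the base prompt, separated by a blank line (assumed format)."""
    return f"{prompt_text.rstrip()}\n\n{email.strip()}\n"


def parse_answer(reply):
    """Normalize a model reply to 'Yes' or 'No'; return None if it is neither.

    Assumes the model answers with a single word, possibly quoted or
    followed by a period.
    """
    answer = reply.strip().strip(".'\"").capitalize()
    return answer if answer in ("Yes", "No") else None
```

In use, `prompt_text` would be read from `prompt.md`, the result of `build_prompt` sent to the LLM of choice, and the reply passed through `parse_answer` before comparing it with the `output` column.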
## Use
This data is intended for research purposes only and should not be used in any other context.
## Availability
- [Hugging Face](https://huggingface.co/datasets/RevaHQ/data-phishing-detection)
- [GitHub](https://github.com/RevaHQ/data-phishing-detection)
Provider:
RevaHQ



