RevaHQ/data-phishing-detection

Name: RevaHQ/data-phishing-detection
Creator: RevaHQ
Published: 2024-10-24 12:03:52
License: not specified

Source: Hugging Face. Updated 2024-10-24; indexed 2025-11-15.

Download link:

https://hf-mirror.com/datasets/RevaHQ/data-phishing-detection

Description:

---
language:
- en
---

# data-phishing-detection

A dataset to test methods to detect phishing emails

The file `data.parquet` contains the dataset, 400 emails. 200 are synthetic phishing attempts and 200 are synthetic regular emails.

## Schema

- `input` - an email, synthesized by an LLM, that is either a phishing attempt or a regular email.
- `output` - 'Yes' if the email is a phishing attempt, 'No' otherwise.

## Prompt

The `prompt.md` file contains a prompt that can be used with an LLM as a starting point for detection. Append an example to the prompt for an LLM to classify.

## Use

This data is intended for research purposes only and should not be used in any other context.

## Availability

- [Hugging Face](https://huggingface.co/datasets/RevaHQ/data-phishing-detection)
- [GitHub](https://github.com/RevaHQ/data-phishing-detection)
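
The schema and prompt described above could be used roughly as follows. This is a minimal sketch, not code from the dataset authors: it assumes `data.parquet` and `prompt.md` have been downloaded locally, that pandas with a parquet engine (such as pyarrow) is installed, and that the columns are literally named `input` and `output`. The helper names `load_examples`, `build_prompt`, `accuracy`, and `demo` are hypothetical, and the way the email is appended (after a blank line) is an assumption, since the card only says to append an example.

```python
from pathlib import Path


def load_examples(parquet_path):
    """Read the dataset into a list of (input, output) pairs.

    Assumption: columns are named `input` and `output`, as in the schema.
    """
    import pandas as pd  # imported lazily so the helpers below work without it

    df = pd.read_parquet(parquet_path)
    return list(zip(df["input"], df["output"]))


def build_prompt(base_prompt, email):
    """Append one email to the base prompt, separated by a blank line."""
    return base_prompt.rstrip() + "\n\n" + email.strip() + "\n"


def accuracy(predictions, labels):
    """Fraction of predictions that match the 'Yes'/'No' labels exactly."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        return 0.0
    correct = sum(p.strip() == l for p, l in zip(predictions, labels))
    return correct / len(labels)


def demo(parquet_path="data.parquet", prompt_path="prompt.md"):
    """Print the prompt for the first email; paths are assumed local files."""
    examples = load_examples(parquet_path)
    base_prompt = Path(prompt_path).read_text(encoding="utf-8")
    first_email, first_label = examples[0]
    print(build_prompt(base_prompt, first_email))
    print("expected label:", first_label)
```

Calling `demo()` prints the prompt that would be sent to an LLM for the first email. Once a model's 'Yes'/'No' answers have been collected, `accuracy` compares them with the `output` column; with 200 emails in each class, a classifier that always gives the same answer would score 0.5.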

Provider:

RevaHQ
