Dizzzy0x00/LLMGen-Phishing-Email-Dataset

Name: Dizzzy0x00/LLMGen-Phishing-Email-Dataset
Creator: Dizzzy0x00
Published: 2025-12-13 06:09:21
License: no description provided

Source: Hugging Face; updated 2025-12-13; indexed 2025-12-20

Download link:

https://hf-mirror.com/datasets/Dizzzy0x00/LLMGen-Phishing-Email-Dataset


Overview:

This dataset contains phishing and legitimate emails generated with large language models (LLMs): DeepSeek for the Chinese emails and OpenAI models for the English emails. It is intended to support research and development in phishing email detection and classification. The dataset has two columns: content (the full text of the email) and label (a binary class label, 0 for legitimate and 1 for phishing). Because the emails are synthetically generated, the dataset can supply diverse, realistic samples for training and evaluating phishing detection models, particularly where real-world labeled data is scarce or sensitive.
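The two-column schema described above can be checked with a few lines of Python. This is a minimal sketch, not the dataset author's tooling: the helper names `validate_row` and `label_distribution` are hypothetical, and the sample rows are invented for illustration rather than taken from the dataset. Only the column names (content, label) and the 0/1 label convention come from the description.

```python
from collections import Counter

# Label convention stated in the dataset description.
LABEL_NAMES = {0: "legitimate", 1: "phishing"}


def validate_row(row):
    """Return True if the row matches the described schema:
    a string `content` and an integer `label` of 0 or 1."""
    label = row.get("label")
    return (
        isinstance(row.get("content"), str)
        and type(label) is int  # rejects booleans and strings such as "1"
        and label in LABEL_NAMES
    )


def label_distribution(rows):
    """Count valid rows per class name, skipping malformed rows."""
    counts = Counter(
        LABEL_NAMES[row["label"]] for row in rows if validate_row(row)
    )
    return dict(counts)


# Hypothetical rows, invented for illustration; not taken from the dataset.
sample_rows = [
    {"content": "Your invoice for March is attached.", "label": 0},
    {"content": "Verify your account now or it will be suspended.", "label": 1},
    {"content": "Team lunch moved to Friday.", "label": 0},
    {"content": None, "label": 1},  # malformed: content is not a string
]

distribution = label_distribution(sample_rows)
print(distribution)
```

In practice the rows would come from the published files rather than an in-memory list, for example through the Hugging Face `datasets` library. The split names and file layout are not stated in this listing, so check them on the dataset page before relying on them.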

Provider:

Dizzzy0x00
