Multi-Source Phishing, Spam, Ham Email Dataset with SPF, URL, and Structural Features
Source: DataCite Commons | Updated: 2026-05-04 | Indexed: 2026-05-04
Download link:
https://ieee-dataport.org/documents/multi-source-phishing-spam-ham-email-dataset-spf-url-and-structural-features-0
Description:
This dataset aggregates email samples from five publicly available sources (the Nazario Phishing Corpus, MeAJOR Corpus, Enron Fraud Email Dataset, realprogrammersusevim/email-dataset, and rf-peixoto/phishing_pot) into a unified, normalized CSV designed for phishing and spam detection research. Each record is labeled as phishing, spam, or legitimate, and is enriched with extracted features, including URL statistics, attachment indicators, SPF authentication results, and an urgency score derived from linguistic cues. The dataset integrates raw emails from diverse formats, including CSV, mbox, and .eml files, and processes them through a three-stage pipeline of downloading, combining, and normalization. During normalization, additional cleaning, feature extraction, and standardization are applied to ensure consistency across sources.
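To make the feature types named in the description concrete, the following is a minimal Python sketch of how comparable features could be derived from a single raw email using only the standard library. It is not the dataset authors' pipeline: the function names (`body_text`, `extract_features`), the output keys, the URL pattern, the cue list in `URGENCY_CUES`, and the use of the `Received-SPF` header as the SPF source are all assumptions made for illustration.

```python
import re
from email import message_from_string
from email.message import Message

# Assumed URL pattern; the dataset's actual URL extraction rule is not documented here.
URL_RE = re.compile(r"https?://[^\s<>\"']+", re.IGNORECASE)

# Hypothetical cue list; the description does not say which linguistic cues are used.
URGENCY_CUES = ("urgent", "immediately", "verify", "suspended", "act now")


def body_text(msg: Message) -> str:
    """Concatenate all non-attachment text parts of a message."""
    chunks = []
    for part in msg.walk():
        if part.get_content_maintype() != "text":
            continue
        if part.get_content_disposition() == "attachment":
            continue
        payload = part.get_payload(decode=True)
        if payload is None:
            continue
        charset = part.get_content_charset() or "utf-8"
        try:
            chunks.append(payload.decode(charset, errors="replace"))
        except LookupError:
            # Unknown charset name: fall back to UTF-8 with replacement.
            chunks.append(payload.decode("utf-8", errors="replace"))
    return "\n".join(chunks)


def extract_features(raw_email: str) -> dict:
    """Return illustrative URL, attachment, SPF, and urgency features."""
    msg = message_from_string(raw_email)
    body = body_text(msg)
    subject = str(msg.get("Subject", ""))

    # Attachment indicator: any part explicitly marked as an attachment
    # or carrying a filename.
    has_attachment = any(
        part.get_content_disposition() == "attachment" or part.get_filename()
        for part in msg.walk()
    )

    # SPF result: first token of the Received-SPF header, if present.
    # "missing" is a placeholder chosen here, not a value from the dataset.
    spf_header = str(msg.get("Received-SPF", "")).strip()
    spf_result = spf_header.split()[0].lower() if spf_header else "missing"

    # Urgency score: simple count of cue occurrences in subject + body.
    text = (subject + "\n" + body).lower()
    urgency_score = sum(text.count(cue) for cue in URGENCY_CUES)

    return {
        "url_count": len(URL_RE.findall(body)),
        "has_attachment": bool(has_attachment),
        "spf_result": spf_result,
        "urgency_score": urgency_score,
    }
```

Calling `extract_features` on the text of an .eml file returns one dictionary per message, which could then be written as a CSV row alongside the phishing, spam, or legitimate label. A keyword count is only one possible way to score urgency; the published dataset may use a different method.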
Provider:
IEEE DataPort
Created:
2026-05-04



