Persuasion Sentences in Spam Email (PerSentSE)

NIAID Data Ecosystem, indexed 2026-05-02

Download link:

https://zenodo.org/record/14585763

Description:

How to Access: To access this dataset, please contact Francisco Janez via email at francisco.janez@unileon.es. Access will be granted based on specific requests.

Purpose: The PerSentSE corpus was developed to study persuasive techniques in spam emails. It includes 130 emails randomly selected from the SpamArchive2122 dataset, which contains over 20,000 spam emails in English.

Methodology:
Segmentation: Emails were divided into sentences using the NLTK library.
Annotation: Eight persuasive techniques, along with a "non-persuasion" class, were identified. Two expert annotators labeled an initial subset of emails to measure inter-annotator agreement, achieving a final acceptable level (γ = 0.63).

Corpus Statistics:
Total sentences: 1,075
Persuasive sentences: 216 (20.1%)

Persuasion Distribution by Email Sections (Table 7):
Subject lines: 35.59% persuasive, with an average of 1.62 techniques.
Greeting section: 54.17% persuasive, averaging 1.46 techniques.
Email body: 82.46% persuasive, with 5.51 techniques on average.
Farewell section: 31.43% persuasive, averaging 1.45 techniques.

Co-occurrence of Techniques (Figure 2): Some persuasive techniques frequently appeared together:
Appeal to Fear/Prejudice with Loaded Language: 25 instances.
Exaggeration/Minimization with Loaded Language: 24 instances.
Appeal to Fear/Prejudice with Exaggeration/Minimization: 20 instances.

Findings: The body section of emails concentrates the highest number of persuasive elements, contrary to earlier studies focusing on subject lines alone. This suggests that spam emails rely heavily on persuasive content in their main text.
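The co-occurrence counts and the persuasive-sentence share described above could be computed along the following lines. This is a minimal sketch, not the authors' code: the `annotations` list is hypothetical toy data rather than PerSentSE content, the function names are illustrative, and it assumes co-occurrence is counted per sentence, which the description does not state.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy annotations: one set of technique labels per sentence.
# These are NOT taken from PerSentSE; an empty set stands for "non-persuasion".
annotations = [
    {"Loaded Language", "Appeal to Fear/Prejudice"},
    {"Loaded Language", "Exaggeration/Minimization"},
    {"Loaded Language", "Appeal to Fear/Prejudice", "Exaggeration/Minimization"},
    set(),
    {"Loaded Language"},
]


def cooccurrence_counts(labelled_sentences):
    """Count how often each unordered pair of techniques shares a sentence."""
    counts = Counter()
    for labels in labelled_sentences:
        # Sorting makes each pair a canonical (alphabetical) tuple key.
        for pair in combinations(sorted(labels), 2):
            counts[pair] += 1
    return counts


def persuasive_share(labelled_sentences):
    """Percentage of sentences carrying at least one technique label."""
    persuasive = sum(1 for labels in labelled_sentences if labels)
    return 100.0 * persuasive / len(labelled_sentences)


pair_counts = cooccurrence_counts(annotations)
share = persuasive_share(annotations)

# The same percentage formula applied to the reported corpus totals.
reported_share = round(100 * 216 / 1075, 1)
```

Applied to the reported totals, the same formula reproduces the listed share: 216 of 1,075 sentences rounds to 20.1%.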

Created:

2025-01-08
