
Persuasion Sentences in Spam Email (PerSentSE)

Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/record/14585763
Description:
How to Access: To access this dataset, please contact Francisco Janez via email at francisco.janez@unileon.es. Access will be granted based on specific requests.

Purpose: The PerSentSE corpus was developed to study persuasive techniques in spam emails. It includes 130 emails randomly selected from the SpamArchive2122 dataset, which contains over 20,000 spam emails in English.

Methodology:
Segmentation: Emails were divided into sentences using the NLTK library.
Annotation: Eight persuasive techniques, along with a "non-persuasion" class, were identified. Two expert annotators labeled an initial subset of emails to measure inter-annotator agreement, achieving a final acceptable level (γ = 0.63).

Corpus Statistics:
Total sentences: 1,075
Persuasive sentences: 216 (20.1%)

Persuasion Distribution by Email Sections (Table 7):
Subject lines: 35.59% persuasive, with an average of 1.62 techniques.
Greeting section: 54.17% persuasive, averaging 1.46 techniques.
Email body: 82.46% persuasive, with 5.51 techniques on average.
Farewell section: 31.43% persuasive, averaging 1.45 techniques.

Co-occurrence of Techniques (Figure 2): Some persuasive techniques frequently appeared together:
Appeal to Fear/Prejudice with Loaded Language: 25 instances.
Exaggeration/Minimization with Loaded Language: 24 instances.
Appeal to Fear/Prejudice with Exaggeration/Minimization: 20 instances.

Findings: The body section of emails concentrates the highest number of persuasive elements, contrary to earlier studies focusing on subject lines alone. This suggests that spam emails rely heavily on persuasive content in their main text.
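
The corpus statistics and the co-occurrence counts described above can be illustrated with a minimal Python sketch. This is not the authors' code. The sample annotations are invented for illustration, the function names `persuasive_share` and `cooccurrence_counts` are hypothetical, and counting co-occurrence per sentence is an assumption, since the description does not state the unit used. Only the totals 216 and 1,075 come from the description.

```python
from collections import Counter
from itertools import combinations

# Hypothetical annotations (NOT taken from the corpus): one set of technique
# labels per sentence. An empty set stands for the "non-persuasion" class.
sentences = [
    {"Appeal to Fear/Prejudice", "Loaded Language"},
    {"Exaggeration/Minimization", "Loaded Language"},
    {"Appeal to Fear/Prejudice", "Loaded Language", "Exaggeration/Minimization"},
    set(),
    {"Loaded Language"},
]


def persuasive_share(labels):
    """Return the percentage of sentences that carry at least one technique."""
    persuasive = sum(1 for techniques in labels if techniques)
    return 100 * persuasive / len(labels)


def cooccurrence_counts(labels):
    """Count how often each unordered pair of techniques shares a sentence.

    Assumption: co-occurrence is counted per sentence; the source does not
    say whether the unit is the sentence, the section, or the whole email.
    """
    counts = Counter()
    for techniques in labels:
        # Sorting makes each pair appear in one canonical order.
        for pair in combinations(sorted(techniques), 2):
            counts[pair] += 1
    return counts


share = persuasive_share(sentences)
pairs = cooccurrence_counts(sentences)

# The reported corpus-level share, recomputed from the totals in the description.
corpus_share = round(100 * 216 / 1075, 1)

print(f"Toy sample: {share:.1f}% persuasive")
print(f"Corpus (from reported totals): {corpus_share}% persuasive")
for pair, n in pairs.most_common():
    print(pair, n)
```

Applied to the real annotations, a count of this kind would yield pair frequencies comparable to those listed for Figure 2, provided the same counting unit is used.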
Created:
2025-01-08