ScamGen
Source: NIAID Data Ecosystem, indexed 2026-05-02
Download link:
https://data.mendeley.com/datasets/dkypjhkmgb
Description:
ScamGen: A Comprehensive Dataset of Chinese Telephone Scams
This dataset, built with the ScamGen technique, captures the psychological dynamics between scammers and victims in Chinese telephone scams. It is derived from a multi-source data collection framework and expanded through template-based data augmentation, which applies sentence- and word-level perturbations to the scammer-victim interactions to generate diverse, realistic scenarios across a wide variety of scam types and techniques.
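As a rough illustration of what template-based augmentation with sentence- and word-level perturbations can look like, here is a minimal Python sketch. It is not the authors' ScamGen implementation: the template, the slot values, the synonym table, and all function names (`fill_template`, `perturb_sentences`, `perturb_words`, `generate`) are hypothetical, and English placeholder text stands in for the Chinese dialogue in the actual dataset.

```python
import random

# Hypothetical template and slot values, invented for illustration only;
# they are not taken from the ScamGen dataset.
TEMPLATE = [
    "Hello, this is {role} calling.",
    "Your {item} has a problem.",
    "You must act {when}.",
]
SLOTS = {
    "role": ["the bank", "customer service"],
    "item": ["account", "parcel"],
    "when": ["immediately", "within one hour"],
}
# Hypothetical word-level substitutions.
SYNONYMS = {"problem": ["issue", "fault"], "calling": ["speaking"]}


def fill_template(rng):
    """Fill every slot in the template with a randomly chosen value."""
    values = {name: rng.choice(options) for name, options in SLOTS.items()}
    return [sentence.format(**values) for sentence in TEMPLATE]


def perturb_sentences(sentences, rng):
    """Sentence-level perturbation: keep the opening line, shuffle the rest."""
    head, tail = sentences[:1], sentences[1:]
    rng.shuffle(tail)
    return head + tail


def perturb_words(sentence, rng, prob=0.5):
    """Word-level perturbation: swap known words for a synonym with probability prob."""
    out = []
    for word in sentence.split():
        core = word.rstrip(".,!?")      # separate trailing punctuation
        suffix = word[len(core):]
        if core in SYNONYMS and rng.random() < prob:
            core = rng.choice(SYNONYMS[core])
        out.append(core + suffix)
    return " ".join(out)


def generate(n, seed=0):
    """Produce n augmented utterances from the single template above."""
    rng = random.Random(seed)           # seeded so runs are repeatable
    samples = []
    for _ in range(n):
        sentences = perturb_sentences(fill_template(rng), rng)
        samples.append(" ".join(perturb_words(s, rng) for s in sentences))
    return samples


if __name__ == "__main__":
    for line in generate(3):
        print(line)
```

Separating the two perturbation levels keeps each one easy to tune: sentence order and word choice vary independently, so even a single template yields many distinct surface forms.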
The dataset covers scam strategies such as urgency, impersonation, and emotional manipulation, mirroring the psychological tactics that scammers use in practice. According to its authors, evaluation shows that it outperforms large language models at generating diverse, high-quality scam-related data.
Alongside the dataset, five deep learning models for intent detection were developed, with BERT achieving a precision of 86.68%. The dataset is intended for researchers and practitioners in cybersecurity and fraud detection, supporting a deeper understanding of telephone scammers' tactics and the development of more effective detection systems.
Created:
2024-09-16



