Korean Voice Phishing Detection Dataset with Multilingual Back-Translation and SMOTE Augmentations
Source: ieee-dataport.org. Indexed: 2025-03-26
Download link:
https://ieee-dataport.org/documents/korean-voice-phishing-detection-dataset-multilingual-back-translation-and-smote
Description:
This dataset contains the original and augmented versions of the Korean Call Content Vishing (KorCCVi v2) dataset used in the study "Enhancing Voice Phishing Detection Using Multilingual Back-Translation and SMOTE: An Empirical Study." It addresses data imbalance and asymmetry in Korean voice phishing detection through two data augmentation techniques: multilingual back-translation (BT), with English, Chinese, and Japanese as intermediate languages, and the Synthetic Minority Oversampling Technique (SMOTE). The augmented dataset is intended as a resource for machine learning (ML) and deep learning (DL) applications in natural language processing (NLP) and cybersecurity research.
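For readers unfamiliar with SMOTE, the following is a minimal sketch of its core idea: each synthetic minority sample is an interpolation between an existing minority sample and one of its nearest minority neighbours. It is not the study's implementation. The function name `smote`, its parameters, and the 2-D feature vectors are hypothetical, and the listing does not say how the call transcripts were turned into numeric features.

```python
import math
import random


def smote(minority, n_new, k=3, seed=0):
    """Return n_new synthetic points interpolated between minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        # Pick a random minority sample as the base point.
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        # Choose one of the k nearest neighbours and interpolate towards it.
        neighbour = rng.choice(others[:k])
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical feature vectors standing in for vectorized vishing transcripts.
vishing = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2], [1.2, 2.5]]
new_points = smote(vishing, n_new=6)
```

Because every synthetic point lies on a line segment between two real minority samples, the new points stay inside the region the minority class already occupies. In practice a maintained library implementation would usually be preferred over a hand-written one.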
Provider:
ieee-dataport.org



