SMS Fraud Classification dataset for Chichewa

NIAID Data Ecosystem, indexed 2026-05-02
Download link:
https://zenodo.org/record/14607453
Description:
The dataset contains 676 SMSs in Chichewa and was used to experiment with machine learning models for fraud classification. There are six versions of the dataset in total. D-CHI contains the SMSs in Chichewa, D-HT is a human translation of D-CHI, and D-MT is a machine translation of D-CHI produced with Google Translate. These three datasets are balanced: each contains an equal number of fraudulent and normal SMSs. Three extension sets of 148 SMSs each, containing only normal SMSs, were also used. Adding them to the three balanced datasets yields extended, unbalanced versions denoted D-CHIe, D-HTe and D-MTe. The attached paper explains the methodology used. Please note that the GitHub repo and this dataset are private but will be made public with the publication of the results from the dataset; we expect this to happen in the next few months. In the meantime, if you are interested in working with this dataset, please contact us.
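The class balance implied by the description can be worked out with a short sketch. It assumes that each balanced dataset (D-CHI, D-HT, D-MT) holds the 676 SMSs mentioned above and that the 148 normal-only SMSs are simply appended to form each extended version. The function and variable names are hypothetical and do not come from the dataset or its repository.

```python
# Assumption: each balanced dataset (D-CHI, D-HT, D-MT) holds the 676 SMSs
# mentioned in the description, split evenly between fraud and normal.
BALANCED_TOTAL = 676
EXTRA_NORMAL = 148  # normal-only SMSs added to form D-CHIe, D-HTe, D-MTe


def class_counts(balanced_total: int, extra_normal: int = 0) -> dict:
    """Per-class counts for a balanced set plus optional normal-only extras."""
    if balanced_total % 2 != 0:
        raise ValueError("a balanced dataset needs an even number of messages")
    fraud = balanced_total // 2
    normal = balanced_total // 2 + extra_normal
    total = fraud + normal
    return {
        "fraud": fraud,
        "normal": normal,
        "total": total,
        "fraud_share": fraud / total,
    }


balanced = class_counts(BALANCED_TOTAL)
extended = class_counts(BALANCED_TOTAL, EXTRA_NORMAL)
print("balanced:", balanced)
print("extended:", extended)
```

Under these assumptions, an extended version would contain 824 SMSs, of which 338 are fraudulent and 486 are normal, so accuracy alone may be a misleading metric on the extended sets.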
Created:
2025-01-07