Data mining approaches for toxic chat detection and churn prediction in online gaming environments
Source: DataCite Commons. Updated 2025-09-07. Indexed 2026-05-04.
Download link:
http://doi.nrct.go.th/?page=resolve_doi&resolve_doi=10.14457/TU.the.2024.577
Resource description:
This research developed and evaluated specialized data mining techniques for toxic chat detection and user churn prediction, two problems that degrade user experience in online gaming environments. The first study proposed a hybrid model that combines BERT's contextual understanding with the character-level pattern recognition of a character-level CNN (CharCNN) to improve toxic chat detection performance. The second study improved user churn prediction accuracy by systematically classifying non-login periods and applying appropriate data imputation techniques.

For toxic chat detection, experiments were conducted using chat messages from the game Dota 2. The proposed BERT+CharCNN hybrid model achieved an F1 score of 0.9205, outperforming all single-model baselines (TF-IDF+LR: 0.8678, BERT: 0.8920, CharCNN: 0.8963). In particular, the model accurately detected disguised toxic expressions that use creative spellings or special characters (for example, "ff!@3ckk you") with high confidence (0.9266-0.9794).

For user churn prediction, data from the MMORPG Blade & Soul was used. Rule-based classification based on First Active Week (FAW) distinguished the various causes of non-login periods, and missing values were handled with Multiple Imputation by Chained Equations (MICE). The optimal configuration achieved a weighted F1 score of 0.7065, an improvement of approximately 3% over the baseline. Statistical significance was confirmed with Friedman tests.

The main contributions of this research are as follows. First, a hybrid toxic chat detection model that effectively integrates contextual understanding and character-level pattern recognition was proposed. Second, a novel churn prediction approach that reinterprets non-login periods as behaviorally meaningful was developed. Third, across both studies, a comprehensive user management approach was presented that considers both short-term improvement of the gaming experience and long-term user retention.

The results can support healthier communities and effective user retention strategies in the online gaming industry, and they offer practical guidelines for game operators developing personalized moderation and retention strategies. Future research should improve model interpretability, extend the approach to other game genres, and support multiple languages.
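
The churn results above are reported as a weighted F1 score. As a reading aid, the following minimal sketch shows how that metric is commonly defined: the per-class F1 scores averaged with weights equal to each class's share of the true labels (the same definition scikit-learn uses for average="weighted"). The function name weighted_f1 and the "stay"/"churn" labels are hypothetical illustrations and are not taken from the thesis or its data.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted mean of per-class F1 scores.

    Illustrative helper (hypothetical name), not the thesis's evaluation code.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    support = Counter(y_true)  # number of true examples per class
    total = 0.0
    for label, n in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = n - tp
        # F1 = 2*TP / (2*TP + FP + FN); the denominator is at least n > 0 here.
        f1 = 2 * tp / (2 * tp + fp + fn)
        total += n * f1  # weight each class by its support
    return total / len(y_true)


# Made-up labels purely for illustration.
y_true = ["stay", "stay", "stay", "churn", "churn", "stay"]
y_pred = ["stay", "churn", "stay", "churn", "stay", "stay"]
score = weighted_f1(y_true, y_pred)
print(round(score, 4))
```

Because the weights follow class frequency, the majority class dominates the score, which is worth keeping in mind when comparing weighted F1 values across datasets with different churn rates.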
Provided by:
Thammasat University
Created:
2025-09-07



