Transformation framework for explaining complex machine learning models using data enhancement techniques
DataCite Commons | Updated 2025-02-06 | Indexed 2025-04-16
Download link:
http://doi.nrct.go.th/?page=resolve_doi&resolve_doi=10.14457/TU.the.2024.120
Resource description:
Artificial Intelligence continues to be a rapidly evolving and influential technology in many aspects of our lives, but several problems remain in practical usage. Among these, bias prevention and explainability are two crucial aspects of responsible AI development and deployment. This work proposes a data imputation method. Imputation is a data preprocessing technique for handling missing values in a dataset, which matters because missing data can introduce bias into the analysis, particularly when the data is not missing completely at random. The proposed method is based on the Bee Algorithm and uses k-nearest neighbors with linear regression to guide the search for appropriate imputed values from the remaining data instead of generating random data. In the selection process for imputation values, the Gini importance score is employed. The imputed values are shown to enhance discriminative power in classification tasks. Experimental results demonstrated that the proposed method achieved the highest accuracy across all datasets compared to other methods. Compared to the original dataset, classification models trained on the imputed datasets yielded a 15-25% increase in prediction accuracy. For explainability, this research presents a method for generating an explainable model from the given information of a black-box model, using the concept of knowledge transfer to synthesize a dataset. Explainability in AI refers to AI systems whose decisions and actions can be explained, making clear why a particular decision was made and providing insight into how the system processes data and arrives at conclusions.
This work proposes a method to generate an explainable model based on the concept of knowledge transfer, inspired by the Knowledge Distillation architecture, which treats a large and complex model as a teacher model from which to generate a smaller and simpler student model that retains significant knowledge and performs similarly to the original model. The method synthesizes a dataset based on information given in the complex model and uses it to train a prediction model that is explainable and understandable. The technique used for data synthesis is Knowledge Transfer with Bee Algorithm. The experimental results indicate that knowledge transfer from Bee-based data synthesis performs better than GAN in terms of the coefficient of determination R². Additionally, the accuracy evaluation shows that the F1 scores of explainable models from the Bee-based method are close to the F1 score of a model generated from the original dataset.
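The teacher-to-student idea in the description can be illustrated with a minimal sketch. It is not the thesis's method: the black-box `teacher_predict` function is hypothetical, plain uniform random sampling stands in for the Bee Algorithm data synthesis, and the explainable student is assumed to be a one-split decision stump. The sketch only shows the general pattern of synthesizing inputs, labeling them with the teacher, training a simple model on the result, and measuring how often the two agree.

```python
import random

def teacher_predict(x):
    """Hypothetical black-box teacher: a stand-in for any opaque classifier."""
    # Nonlinear rule that the student never sees directly.
    return 1 if (x[0] ** 2 + 0.3 * x[1]) > 0.5 else 0

def synthesize(n, n_features, rng):
    """Draw synthetic inputs uniformly from [0, 1]^d.

    Assumption: random sampling replaces the Bee Algorithm search
    mentioned in the description.
    """
    return [[rng.random() for _ in range(n_features)] for _ in range(n)]

def fit_stump(X, y):
    """Fit a one-split decision stump (the student) by exhaustive search."""
    best = None  # (errors, feature, threshold, label_if_above)
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X}):
            for above in (0, 1):
                errors = sum(
                    (above if row[f] > t else 1 - above) != label
                    for row, label in zip(X, y)
                )
                if best is None or errors < best[0]:
                    best = (errors, f, t, above)
    _, f, t, above = best
    return f, t, above

def stump_predict(stump, x):
    """Apply the stump: predict `above` when the feature exceeds the threshold."""
    f, t, above = stump
    return above if x[f] > t else 1 - above

def fidelity(stump, X):
    """Fraction of inputs on which the student agrees with the teacher."""
    agree = sum(stump_predict(stump, x) == teacher_predict(x) for x in X)
    return agree / len(X)

rng = random.Random(0)
X_syn = synthesize(300, 2, rng)               # synthetic inputs
y_syn = [teacher_predict(x) for x in X_syn]   # labels supplied by the teacher
student = fit_stump(X_syn, y_syn)             # explainable student model
X_test = synthesize(300, 2, rng)              # fresh samples for evaluation
test_fidelity = fidelity(student, X_test)

print("student rule: feature", student[0], "> %.3f ->" % student[1], student[2])
print("fidelity on fresh samples: %.2f" % test_fidelity)
```

The student here is a single readable rule of the form "feature i > threshold", so its agreement with the teacher on fresh samples gives a rough sense of how much of the black-box behavior a simple model can capture. A guided search such as the Bee Algorithm would replace `synthesize` with a procedure that concentrates samples where they are most informative.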
Provider:
Thammasat University
Created:
2025-02-06



