Performance measure after applying SMOTE+ENN.

NIAID Data Ecosystem, indexed 2026-05-02

Download link:

https://figshare.com/articles/dataset/Performance_measure_after_applying_SMOTE_ENN_/25948759

Description:

Thyroid disease classification plays a crucial role in the early diagnosis and effective treatment of thyroid disorders. Machine learning (ML) techniques have demonstrated remarkable potential in this domain, offering accurate and efficient diagnostic tools. Most real-life datasets are imbalanced, which hampers the overall performance of classifiers. Existing data-balancing techniques process the whole dataset at once, which sometimes causes overfitting or underfitting. In addition, the complexity of some ML models, often referred to as “black boxes,” raises concerns about their interpretability and clinical applicability. This paper presents a comprehensive study of the analysis and interpretability of various ML models for classifying thyroid diseases. We first apply a new data-balancing mechanism based on a clustering technique and then analyze the performance of different ML algorithms. To address the interpretability challenge, we explore model explanation and feature importance analysis using eXplainable Artificial Intelligence (XAI) tools, both globally and locally. Finally, the XAI results are validated with domain experts. Experimental results show that the proposed mechanism is efficient in diagnosing thyroid disease and can explain the models effectively. The findings can help bridge the gap between the adoption of advanced ML techniques and the clinical requirements of transparency and accountability in diagnostic decision-making.
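
The record title refers to SMOTE+ENN, a common resampling approach for imbalanced data. The sketch below is only an illustration of that general technique in pure Python: it is not the authors' clustering-based balancing mechanism, and the helper names (`k_nearest`, `smote`, `enn`), the toy data, and the choice of a majority-vote ENN rule with k=3 are all assumptions made for this example. It also assumes the minority class has at least two samples. A real study would more likely rely on a library implementation.

```python
import math
import random
from collections import Counter


def k_nearest(i, X, k, candidates):
    """Indices of the k candidates closest to X[i] (Euclidean), excluding i."""
    others = [j for j in candidates if j != i]
    others.sort(key=lambda j: math.dist(X[i], X[j]))
    return others[:k]


def smote(X, y, minority, k=3, seed=0):
    """Oversample `minority` by interpolation until it matches the largest class."""
    rng = random.Random(seed)
    idx = [i for i, label in enumerate(y) if label == minority]
    target = max(Counter(y).values())
    X_new, y_new = list(X), list(y)
    for _ in range(target - len(idx)):
        i = rng.choice(idx)                      # a random minority sample
        j = rng.choice(k_nearest(i, X, k, idx))  # one of its minority neighbours
        gap = rng.random()                       # position along the segment i -> j
        X_new.append(tuple(a + gap * (b - a) for a, b in zip(X[i], X[j])))
        y_new.append(minority)
    return X_new, y_new


def enn(X, y, k=3):
    """Drop every sample whose label disagrees with the majority of its k neighbours."""
    everyone = range(len(X))
    keep = []
    for i in everyone:
        votes = Counter(y[j] for j in k_nearest(i, X, k, everyone))
        if votes.most_common(1)[0][0] == y[i]:
            keep.append(i)
    return [X[i] for i in keep], [y[i] for i in keep]


# Hypothetical toy data: 8 majority points (label 0), 3 minority points (label 1).
X = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.3, 0.3), (0.1, 0.4),
     (0.4, 0.1), (0.2, 0.3), (0.3, 0.0), (2.0, 2.0), (2.1, 2.2), (2.2, 2.0)]
y = [0] * 8 + [1] * 3

X_over, y_over = smote(X, y, minority=1)  # step 1: synthesize minority samples
X_res, y_res = enn(X_over, y_over)        # step 2: clean noisy or borderline samples
print(Counter(y), Counter(y_over), Counter(y_res))
```

Running SMOTE first and ENN second means the cleaning step can also discard synthetic points that land inside the other class, which is the usual motivation for combining the two.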

Created:

2024-05-31
