Human Attention Benchmark
Source: arXiv · Updated: 2020-06-29 · Indexed: 2024-06-21
Download link:
https://github.com/SinaMohseni/ML-InterpretabilityEvaluation-Benchmark
Description:
The Human Attention Benchmark is a multi-domain dataset created jointly by Texas A&M University and the University of Florida for evaluating the interpretability of machine learning models. It collects attention masks from multiple annotators and covers two domains, image and text, with about 1,500 samples in total drawn from PASCAL VOC 2012, ILSVRC 2014, 20 Newsgroups, and IMDB 50K. Annotators were recruited through Amazon Mechanical Turk and were required to have at least 1,000 approved tasks, a requirement meant to ensure data quality and diversity. The dataset is intended primarily for the quantitative evaluation of model explanations, in particular their completeness and correctness, in order to address model transparency and user trust.
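As a rough illustration of how attention masks from several annotators could be used to score a model explanation, the sketch below averages the annotator masks, binarizes both the averaged mask and a model saliency map, and reports precision and recall. Reading precision as a proxy for "correctness" and recall as a proxy for "completeness" is an assumption made here for illustration, not the benchmark's documented procedure. The function names, the 0.5 threshold, and the toy data are all hypothetical.

```python
from typing import List, Tuple

Mask = List[List[float]]


def average_masks(masks: List[Mask]) -> Mask:
    """Average same-sized annotator masks (values in [0, 1]) element-wise."""
    n = len(masks)
    rows, cols = len(masks[0]), len(masks[0][0])
    return [
        [sum(m[r][c] for m in masks) / n for c in range(cols)]
        for r in range(rows)
    ]


def binarize(mask: Mask, threshold: float = 0.5) -> List[List[bool]]:
    """Mark a cell as attended when its value reaches the (assumed) threshold."""
    return [[value >= threshold for value in row] for row in mask]


def precision_recall(
    explanation: List[List[bool]], human: List[List[bool]]
) -> Tuple[float, float]:
    """Precision and recall of the explanation against the human mask."""
    tp = fp = fn = 0
    for e_row, h_row in zip(explanation, human):
        for e, h in zip(e_row, h_row):
            if e and h:
                tp += 1
            elif e and not h:
                fp += 1
            elif h and not e:
                fn += 1
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy data (hypothetical): three annotators marking a 2x3 grid of regions.
annotator_masks = [
    [[1, 1, 0], [0, 0, 0]],
    [[1, 1, 0], [0, 1, 0]],
    [[1, 0, 0], [0, 1, 0]],
]
# Hypothetical model saliency map over the same grid.
saliency = [[0.9, 0.2, 0.7], [0.1, 0.8, 0.0]]

human_mask = binarize(average_masks(annotator_masks))
model_mask = binarize(saliency)
precision, recall = precision_recall(model_mask, human_mask)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

The same comparison would apply to text if each cell stood for a token rather than an image region. The threshold choice affects both scores, so it is worth reporting alongside any result.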
Provided by:
Texas A&M University
Created:
2018-01-16



