XAI-FUNGI: Dataset from the user study on comprehensibility of XAI algorithms

NIAID Data Ecosystem, indexed 2026-05-02

Download link:

https://zenodo.org/record/11448394

Resource description:

We present a dataset created during a user study on the evaluation of explainability of artificial intelligence (AI) at the Jagiellonian University, carried out as a collaboration between the computer science (GEIST team) and information sciences research groups. The main goal of the research was to explore effective explanations of AI model patterns to diverse audiences.

The dataset contains material collected from 39 participants during interviews conducted by the Information Sciences research group. The participants were recruited from 149 candidates to form three groups: domain experts in the field of mycology (DE), students with a data science and visualization background (IT), and students from social sciences and humanities (SSH). Each group was given an explanation of a machine learning model trained to predict edible and non-edible mushrooms, and was asked to interpret the explanations and answer various questions during the interview. The machine learning model and the explanations for its decisions were prepared by the computer science research team.

The resulting dataset was constructed from the surveys obtained from the candidates, anonymized transcripts of the interviews, the results of thematic analysis, and the original explanations with modifications suggested by the participants. The dataset is complemented by source code that allows one to reproduce the initial machine learning model and explanations.

Files whose names contain [RR]_[SS]_[NN] hold the individual results obtained from a particular participant. The parts of this identifier have the following meaning:

RR: initials of the researcher conducting the interview
SS: type of the participant (DE for domain expert, SSH for social sciences and humanities students, or IT for computer science students)
NN: number of the participant

The general structure of the dataset is described in the following table.

File: Description
SURVEY.csv: The results of a survey filled in by 149 participants, of whom 39 were selected to form the final group of participants.
SURVEY_en.csv: Content of the SURVEY file translated into English.
CODEBOOK.csv: The codebook used in thematic analysis and MAXQDA coding.
QUESTIONS.csv: List of questions that the participants were asked during interviews.
SLIDES.csv: List of slides used in the study with their interpretation and references to MAXQDA themes and VISUAL_MODIFICATIONS tables.
MAXQDA_SUMMARY.csv: Summary of the thematic analysis for each participant, using the codes from CODEBOOK.
PROBLEMS.csv: List of problems that participants were asked to solve during interviews. They correspond to three instances from the dataset that the participants had to classify using knowledge gained from the explanations.
PROBLEMS_en.csv: Content of the PROBLEMS file translated into English.
PROBLEMS_RESPONSES.csv: Each participant's responses to the problems listed in PROBLEMS.csv.
VISUALIZATION_MODIFICATIONS.csv: Information on how the order of the slides was modified by the participant, which slides (explanations) were removed, and what kind of additional explanation was suggested.
ORIGINAL_VISUZALIZATIONS.pdf: The PDF file containing the visualizations of explanations presented to the participants during the interviews.
ORIGINAL_VISUZALIZATIONS_EN.pdf: Content of ORIGINAL_VISUZALIZATIONS translated into English.
VISUALIZATION_MODIFICATIONS.zip: An archive of PDF files containing the original slides from ORIGINAL_VISUZALIZATIONS.pdf with the modifications suggested by each participant. Each file is a PDF named with the participant ID, i.e. [RR]_[SS]_[NN].pdf.
TRANSCRIPTS.zip: The anonymized transcripts of the interviews for each participant, zipped into one archive. Each transcript is named after the participant ID, i.e. [RR]_[SS]_[NN].csv, and contains text tagged with the slide number it relates to, the question number from QUESTIONS.csv, and the problem number from PROBLEMS.csv.

The detailed structure of the files presented in the table above is given in the Technical info section. The source code used to train the ML model and to generate the explanations is available on GitLab.
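The [RR]_[SS]_[NN] naming convention can be parsed programmatically, for instance to group per-participant files by participant type. The following is a minimal sketch, not part of the dataset's own tooling. It assumes that the researcher initials consist of letters only and that the participant number consists of digits only; the description does not state either. The function names and the example file names are hypothetical.

```python
import re
from collections import Counter
from typing import Iterable, NamedTuple, Optional

# Assumed shape of a participant ID: letters, one of DE/SSH/IT, digits.
# The letter-only and digit-only parts are assumptions, not documented facts.
ID_PATTERN = re.compile(
    r"^(?P<researcher>[A-Za-z]+)_(?P<group>DE|SSH|IT)_(?P<number>\d+)$"
)


class ParticipantId(NamedTuple):
    researcher: str  # RR: initials of the interviewing researcher
    group: str       # SS: DE, SSH, or IT
    number: int      # NN: participant number


def parse_participant_file(filename: str) -> Optional[ParticipantId]:
    """Return the participant ID encoded in a file name, or None if absent."""
    # Drop any directory prefix (e.g. a path inside a zip archive).
    stem = filename.rsplit("/", 1)[-1]
    # Drop the extension (.csv, .pdf), if there is one.
    stem = stem.rsplit(".", 1)[0]
    match = ID_PATTERN.match(stem)
    if match is None:
        return None
    return ParticipantId(
        match["researcher"], match["group"], int(match["number"])
    )


def count_by_group(filenames: Iterable[str]) -> Counter:
    """Count per-participant files by participant type, skipping other files."""
    parsed = (parse_participant_file(name) for name in filenames)
    return Counter(pid.group for pid in parsed if pid is not None)
```

Passing the member names of TRANSCRIPTS.zip (for example from `zipfile.ZipFile.namelist()`) to `count_by_group` gives a quick check of how many transcripts belong to each of the DE, SSH, and IT groups. Files that do not follow the convention, such as SURVEY.csv, are ignored.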

Created:

2025-03-07
