XAI-FUNGI: Dataset from the user study on comprehensibility of XAI algorithms
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/record/11448394
Description:
We present a dataset created during a user study on the evaluation of the explainability of artificial intelligence (AI), carried out at the Jagiellonian University as a collaboration between the computer science (GEIST team) and information sciences research groups. The main goal of the research was to explore how AI model patterns can be explained effectively to diverse audiences.
The dataset contains material collected from 39 participants during interviews conducted by the Information Sciences research group. The participants were recruited from 149 candidates to form three groups: domain experts in the field of mycology (DE), students with a data science and visualization background (IT), and students from the social sciences and humanities (SSH). Each group was given an explanation of a machine learning model trained to classify mushrooms as edible or non-edible, and was asked to interpret the explanations and answer various questions during the interview. The machine learning model and the explanations of its decisions were prepared by the computer science research team.
The resulting dataset was constructed from the surveys obtained from the candidates, anonymized transcripts of the interviews, the results of the thematic analysis, and the original explanations with modifications suggested by the participants. The dataset is complemented with source code that allows one to reproduce the initial machine learning model and explanations.
The general structure of the dataset is described in the following table. Files whose names contain [RR]_[SS]_[NN] hold the individual results obtained from a particular participant. The parts of this identifier have the following meaning:
RR - initials of the researcher conducting the interview,
SS - type of the participant (DE for domain expert, SSH for social sciences and humanities students, or IT for computer science students),
NN - number of the participant
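The naming convention above can be turned into a small parser. The following is a minimal sketch, not part of the dataset's own source code: it assumes that the researcher initials consist of letters only and that the participant number consists of digits only, which the description does not state explicitly. The names `ParticipantId` and `parse_participant_id`, as well as the example identifier `AB_DE_07`, are hypothetical.

```python
import re
from typing import NamedTuple

# Assumed pattern: initials are letters, the group is DE, SSH, or IT,
# and the participant number is one or more digits.
PARTICIPANT_ID = re.compile(
    r"^(?P<researcher>[A-Za-z]+)_(?P<group>DE|SSH|IT)_(?P<number>\d+)$"
)


class ParticipantId(NamedTuple):
    researcher: str  # RR: initials of the researcher conducting the interview
    group: str       # SS: DE, SSH, or IT
    number: int      # NN: number of the participant


def parse_participant_id(name: str) -> ParticipantId:
    """Parse '[RR]_[SS]_[NN]' from a file name, ignoring folder and extension."""
    stem = name.rsplit("/", 1)[-1]          # drop any leading folder
    if "." in stem:
        stem = stem.rsplit(".", 1)[0]       # drop the file extension
    match = PARTICIPANT_ID.match(stem)
    if match is None:
        raise ValueError(f"not a participant file name: {name!r}")
    return ParticipantId(
        match["researcher"], match["group"], int(match["number"])
    )
```

For example, `parse_participant_id("AB_DE_07.csv").group` yields `"DE"`, while a dataset-level file such as `SURVEY.csv` raises `ValueError`.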
| File | Description |
| --- | --- |
| SURVEY.csv | The results of a survey filled in by 149 candidates, of whom 39 were selected to form the final group of participants. |
| SURVEY_en.csv | Content of the SURVEY file translated into English. |
| CODEBOOK.csv | The codebook used in the thematic analysis and MAXQDA coding. |
| QUESTIONS.csv | List of questions that the participants were asked during the interviews. |
| SLIDES.csv | List of slides used in the study with their interpretation and references to MAXQDA themes and VISUAL_MODIFICATIONS tables. |
| MAXQDA_SUMMARY.csv | Summary of the thematic analysis for each participant, using the codes from CODEBOOK. |
| PROBLEMS.csv | List of problems that the participants were asked to solve during the interviews. They correspond to three instances from the dataset that the participants had to classify using the knowledge gained from the explanations. |
| PROBLEMS_en.csv | Content of the PROBLEMS file translated into English. |
| PROBLEMS_RESPONSES.csv | Each participant's responses to the problems listed in PROBLEMS.csv. |
| VISUALIZATION_MODIFICATIONS.csv | Information on how the participant modified the order of the slides, which slides (explanations) were removed, and what kind of additional explanation was suggested. |
| ORIGINAL_VISUZALIZATIONS.pdf | The PDF file containing the visualizations of the explanations presented to the participants during the interviews. |
| ORIGINAL_VISUZALIZATIONS_EN.pdf | Content of ORIGINAL_VISUZALIZATIONS translated into English. |
| VISUALIZATION_MODIFICATIONS.zip | An archive of PDF files containing the original slides from ORIGINAL_VISUZALIZATIONS.pdf with the modifications suggested by each participant. Each file is named with the participant ID, i.e. [RR]_[SS]_[NN].pdf. |
| TRANSCRIPTS.zip | The anonymized interview transcripts of all participants, zipped into one archive. Each transcript is named after the participant ID, i.e. [RR]_[SS]_[NN].csv, and contains text tagged with the slide number it relates to, the question number from QUESTIONS.csv, and the problem number from PROBLEMS.csv. |
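As an illustration of how the per-participant transcripts might be read directly from the archive, here is a minimal sketch. It assumes that the CSV files are UTF-8 encoded and comma-delimited, which the description does not confirm, and it makes no assumption about column names, so the rows are returned as plain lists. The function name `read_transcripts` is hypothetical.

```python
import csv
import io
import zipfile
from pathlib import PurePosixPath


def read_transcripts(archive, encoding="utf-8"):
    """Return {participant_id: list of rows} for every .csv member of the archive.

    `archive` may be a path such as "TRANSCRIPTS.zip" or a binary file object.
    The encoding and the comma delimiter are assumptions, not documented facts.
    """
    transcripts = {}
    with zipfile.ZipFile(archive) as zf:
        for member in zf.namelist():
            path = PurePosixPath(member)
            if path.suffix.lower() != ".csv":
                continue  # skip folders and any non-CSV members
            with zf.open(member) as raw:
                text = io.TextIOWrapper(raw, encoding=encoding, newline="")
                # The file stem is the participant ID, i.e. [RR]_[SS]_[NN].
                transcripts[path.stem] = list(csv.reader(text))
    return transcripts
```

Calling `read_transcripts("TRANSCRIPTS.zip")` on the downloaded archive would give one entry per participant; the Technical info section should be consulted for the actual column layout before interpreting the rows.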
The detailed structure of the files listed in the table above is given in the Technical info section.
The source code used to train the ML model and to generate the explanations is available on GitLab.
Created:
2025-03-07



