
Finding the right XAI Method --- Dataset

Source: NIAID Data Ecosystem, indexed 2026-03-14
Download link:
https://zenodo.org/record/7715397
Description:
This dataset provides the preprocessed data used to train the neural networks in Bommer et al., complementing the accompanying source code (https://github.com/philine-bommer/Climate_X_Quantus). In the publication, we introduce XAI evaluation in the context of climate research and assess several desired explanation properties: robustness, faithfulness, randomization, complexity, and localization. To this end, we build upon previous work (Labe and Barnes et al. 2021) and train a multi-layer perceptron (MLP) and a convolutional neural network (CNN) to predict the decade based on annual-mean temperature maps.

Following Labe and Barnes et al. 2021, we use data simulated by the general climate model CESM1 (Hurrell et al. 2013). We use the global 2-m air temperature (T2m) maps from 1920 to 2080. The data consist of 40 ensemble members; each member is generated by varying the atmospheric initial conditions under fixed external forcing, i.e., historical forcings are imposed from 1920 to 2005 and Representative Concentration Pathway 8.5 for the following years (Kay et al. 2015). Following Labe and Barnes et al. 2021, we compute annual averages and apply a bilinear interpolation. This results in T=161 temperature maps for each member, with v=144 longitude grid cells and h=95 latitude grid cells, given the 1.9° sampling in latitude and 2.5° sampling in longitude. The temperature maps are finally standardized by removing the multi-year (1920 to 2080) mean and then dividing by the corresponding standard deviation.

For the MLP, each temperature map is flattened into a vector, whereas the CNN retains the longitude-latitude grid of the temperature maps. Similar to Labe and Barnes et al. 2021, we use the model data described above for training, validation, and testing. For both the MLP and the CNN, 20% of the data serves as the test set, and the remaining 80% is split into a training set (64%) and a validation set (16%).
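The standardization, input layout, and data split described above can be illustrated with a minimal NumPy sketch. It is not the authors' pipeline: the function names, the synthetic data, and the reduced grid size are hypothetical, and two details are assumptions because the description does not state them. The sketch takes the mean and standard deviation per member and per grid cell, and it splits by individual samples with a random shuffle.

```python
import numpy as np


def standardize_maps(maps):
    """Standardize temperature maps along the year axis.

    maps has shape (members, years, lat, lon). Assumption: mean and standard
    deviation are computed per member and per grid cell over all years.
    """
    mean = maps.mean(axis=1, keepdims=True)
    std = maps.std(axis=1, keepdims=True)
    return (maps - mean) / std


def split_indices(n_samples, test_frac=0.20, val_frac=0.16, seed=0):
    """Return shuffled train/validation/test index arrays (64/16/20 by default).

    Assumption: the split is made over individual samples, not ensemble members.
    """
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_test = int(round(test_frac * n_samples))
    n_val = int(round(val_frac * n_samples))
    test = order[:n_test]
    val = order[n_test:n_test + n_val]
    train = order[n_test + n_val:]
    return train, val, test


# Synthetic stand-in data on a reduced grid (the real maps are 95 x 144).
rng = np.random.default_rng(42)
maps = rng.normal(loc=288.0, scale=5.0, size=(4, 161, 6, 8))

z = standardize_maps(maps)
samples = z.reshape(-1, 6, 8)              # CNN-style input: grid preserved
flat = samples.reshape(len(samples), -1)   # MLP-style input: flattened vector
train_idx, val_idx, test_idx = split_indices(len(samples))
```

The same `samples` array feeds both layouts: the CNN input keeps the two spatial axes, while the MLP input collapses them into one feature axis.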
We train both networks to solve a fuzzy classification problem, which combines classification and regression. In the classification setting, the network assigns each map to one of 20 classes, where each class corresponds to one decade between 1900 and 2100 (this class division is necessary for the later regression, as done by Labe and Barnes et al. 2021). The network output is thus a probability vector containing one probability per class. To assess network performance, we use the monthly 2-m air temperature of the 20th Century Reanalysis data (V3) (Slivinski et al. 2019) from 1920 to 2015. The dataset includes two compressed .npz files and a Readme.md. A full description of the data contained in this dataset and instructions for its use are provided in the Readme file.
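To make the link between the 20 decade classes and the later regression concrete, the sketch below shows one common way to set up such a fuzzy classification. It is an assumption and is not confirmed by this description: a year is encoded as a probability vector that shares weight between the two nearest decade centers, and a predicted probability vector is decoded back to a year as the probability-weighted mean of those centers. The names `CENTERS`, `encode_year`, and `decode_year` and the choice of decade midpoints as class centers are hypothetical.

```python
import numpy as np

# 20 decade classes covering 1900-2100; class centers at the decade midpoints
# (assumption: the description does not specify where the centers lie).
DECADE_STARTS = np.arange(1900, 2100, 10)
CENTERS = DECADE_STARTS + 5.0


def encode_year(year):
    """Encode a year as a soft label over the two nearest decade centers."""
    probs = np.zeros(CENTERS.size)
    pos = np.clip((year - CENTERS[0]) / 10.0, 0, CENTERS.size - 1)
    lo = int(np.floor(pos))
    hi = min(lo + 1, CENTERS.size - 1)
    frac = pos - lo
    probs[lo] += 1.0 - frac
    probs[hi] += frac
    return probs


def decode_year(probs):
    """Regress a year as the probability-weighted mean of the decade centers."""
    return float(np.dot(probs, CENTERS))


label = encode_year(1987)
decade_class = int(np.argmax(label))   # hard class used in the classification view
year_estimate = decode_year(label)     # continuous year used in the regression view
```

Under these assumptions, every year between 1920 and 2080 round-trips through `encode_year` and `decode_year` up to floating-point error, which is what lets one network output serve both the classification and the regression view.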
Created:
2023-03-10