ExplainTS: A Benchmark Suite for Reproducible Time-Series XAI Research

Name: ExplainTS: A Benchmark Suite for Reproducible Time-Series XAI Research
Creator: Zenodo
Published: 2026-03-08 14:02:35
License: Not specified

Zenodo. Updated 2026-03-08. Indexed 2026-05-26.

Download link:

https://zenodo.org/doi/10.5281/zenodo.18892817

Resource description:

Collection of Time-Series Classification Datasets with Pretrained DL Models and Local Post-hoc Explainers (SHAP, LIME, Anchor & PHAR)

Precomputed bundle of post-hoc explanations and black-box models for time-series classification.

This dataset introduces ExplainTS, a comprehensive testbed containing 83 univariate and 20 multivariate time-series datasets from the UCR/UEA repository (source: TSC), each used for multiclass classification with a deep learning model. For each dataset, we provide:

- Precomputed train/test splits (75/25).
- A trained ConvLSTM-based TensorFlow model (SavedModel and .h5 formats).
- Post-hoc local explanations generated using four methods: Shapley Additive Explanations (SHAP), Local Interpretable Model-agnostic Explanations (LIME), Anchor, and Post-hoc Attribution Rules (PHAR) [1].

Impact of the dataset

These datasets provide a ready-to-use, frozen benchmark layer for Explainable AI in time-series classification. Since models and explanation outputs are precomputed, researchers can immediately use them for evaluation, visualization, or developing new post-hoc XAI metrics without the massive computational overhead of retraining or re-explaining models.

- Comprehensive Coverage: 83 univariate and 20 multivariate UCR/UEA time-series, each with a standardized 75/25 train-test split in NumPy .pickle format.
- Pretrained Models: Ready-to-use ConvLSTM1D models for every dataset (requiring no custom dependencies), eliminating costly training and ensuring experimental consistency.
- Precomputed Explanations: Post-hoc outputs for training and test sets spanning attribution scores (DeepSHAP, LIME) and discrete rule sets (Anchor, PHAR) with confidence and coverage metadata.
- Living Resource: ExplainTS is designed as a community-driven resource; we actively invite researchers to contribute their locally computed explanation artifacts to future releases.
- Educational & Prototyping Value: Includes a ready-to-use Jupyter notebook demonstrating how to calculate XAI stability metrics and render publication-quality explanation plots directly over time-series signals.

The following visualizations, generated using the included educational notebook, demonstrate a practical XAI auditing use case on the ECG5000 dataset. We compare the explanations of a reference sample against its nearest neighbor to evaluate method stability.

- View SHAP attributions. Link: https://raw.githubusercontent.com/mozo64/papers/main/zenodo-ucr/results/shap_stability_casestudy_ECG5000.png
- View Discrete PHAR interval rules applied to the same signals. Link: https://raw.githubusercontent.com/mozo64/papers/main/zenodo-ucr/results/phar_casestudy_comparison_ECG5000.png

Companion Code & Notebooks

All resources are openly available under the CC-BY-4.0 license. The linked GitHub repository (https://github.com/mozo64/papers/tree/main/zenodo-ucr) provides a suite of Python scripts and Jupyter notebooks for reproducing and interacting with the benchmark:

- ExplainTS_CaseStudy.ipynb: An educational case study for calculating XAI stability metrics and plotting explanations over time-series.
- UCR-train.ipynb: Pipeline for training the baseline ConvLSTM models.
- UCR-explainers-lime-shap.ipynb: Execution of DeepSHAP and LIME explainers.
- UCR-explainers-anchor.ipynb: Execution of the Anchor explainer with cascade retries.
- UCR-explainers-phar.ipynb: Execution of the PHAR rule extraction and hyperparameter optimization.
- datasets_summary.ipynb: Utility for extracting dataset statistics and model accuracies.

Repository content

- train_test.zip: contains files of the form {uni|multi}_{series_name}_train_and_test.zip. Each includes: trainX.pickle, trainy.pickle, testX.pickle, testy.pickle. Format: `numpy.array`
- models.zip: trained models as directories in the form {uni|multi}_{series_name}_model. Each contains a TensorFlow SavedModel and .h5 file for loading flexibility.
- shap.zip: DeepSHAP values in {series_name}_shap_values.zip. Files: svtr.pickle (train), svts.pickle (test). Format: `numpy.array`
- lime.zip: LIME values in {series_name}_lime_values.zip. Files: lvtr.pickle (train), lvts.pickle (test). Format: `numpy.array`
- anchor.zip: Anchor rule records in {series_name}_anchor_values.zip. Files: avtr.pickle (train), avts.pickle (test). Format: `List[List[Dictionary]]`
- phar.zip: PHAR rules and hyperoptimization logs in {series_name}_phar_values.zip. Files: pvtr.pickle and pvts.pickle, plus phar_metadata.json and phar_trials_log.jsonl.

[1] Mozolewski, M., Bobek, S., & Nalepa, G. J. (2026). Explaining Time Series Classifiers with PHAR: Rule Extraction and Fusion from Post-hoc Attributions. arXiv preprint arXiv:2508.01687. https://arxiv.org/abs/2508.01687
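
The repository layout described above nests one ZIP per dataset inside each top-level archive. The following is a minimal sketch of reading pickled members out of such a nested archive without extracting to disk. The helper name `load_inner_pickles`, the local path, and the example inner file name are assumptions made for illustration and are not part of the ExplainTS code; members are matched by base name because any folder prefix inside the archives is not documented here.

```python
import io
import pickle
import zipfile
from pathlib import PurePosixPath


def load_inner_pickles(outer_zip_path, inner_zip_name, wanted):
    """Return {file_name: unpickled object} for the members listed in `wanted`.

    outer_zip_path: local path to a top-level archive such as train_test.zip
    inner_zip_name: per-dataset archive name inside it (hypothetical example:
                    "uni_ECG5000_train_and_test.zip")
    wanted:         pickle file names, e.g. ["trainX.pickle", "trainy.pickle"]
    """
    wanted = set(wanted)

    with zipfile.ZipFile(outer_zip_path) as outer:
        # Match by base name, since a folder prefix may or may not be present.
        inner_member = next(
            (n for n in outer.namelist() if PurePosixPath(n).name == inner_zip_name),
            None,
        )
        if inner_member is None:
            raise FileNotFoundError(f"{inner_zip_name} not found in {outer_zip_path}")
        inner_bytes = outer.read(inner_member)

    loaded = {}
    with zipfile.ZipFile(io.BytesIO(inner_bytes)) as inner:
        for member in inner.namelist():
            name = PurePosixPath(member).name
            if name in wanted:
                # Unpickling can run arbitrary code: only load archives you trust.
                loaded[name] = pickle.loads(inner.read(member))

    missing = sorted(wanted - set(loaded))
    if missing:
        raise KeyError(f"not found in {inner_zip_name}: {missing}")
    return loaded


# Hypothetical usage (paths and names are assumptions):
# splits = load_inner_pickles(
#     "train_test.zip",
#     "uni_ECG5000_train_and_test.zip",
#     ["trainX.pickle", "trainy.pickle", "testX.pickle", "testy.pickle"],
# )
```

Because the listed format is `numpy.array`, NumPy has to be installed in the environment for the real files to unpickle. The same helper should apply to shap.zip, lime.zip, anchor.zip, and phar.zip if they follow the same nesting.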
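
The description mentions comparing the explanation of a reference sample against that of its nearest neighbor to evaluate stability. The sketch below shows one possible way to do that, under stated assumptions: the nearest neighbor is found by Euclidean distance in input space, and stability is scored as the cosine similarity of the two attribution vectors. This choice of distance and score is an illustration only and is not necessarily the metric computed in ExplainTS_CaseStudy.ipynb. All function names are hypothetical, and each series and attribution is assumed to be a flat 1-D sequence of equal length.

```python
import math


def nearest_neighbor(index, series):
    """Index of the series closest (Euclidean) to series[index], excluding itself."""
    ref = series[index]
    best, best_dist = None, math.inf
    for j, other in enumerate(series):
        if j == index:
            continue
        dist = math.dist(ref, other)
        if dist < best_dist:
            best, best_dist = j, dist
    return best


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def attribution_stability(index, series, attributions):
    """Return (neighbor index, cosine similarity of the two attribution vectors)."""
    j = nearest_neighbor(index, series)
    return j, cosine_similarity(attributions[index], attributions[j])
```

A score near 1 would mean that two similar inputs received similar attributions, and a score near 0 or below would flag an unstable explanation. Multivariate series would need to be flattened first for this sketch to apply.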

Provider: Zenodo

Created: 2026-03-08
