ExplainTS: A Benchmark Suite for Reproducible Time-Series XAI Research
Source: Zenodo. Updated: 2026-03-08. Indexed: 2026-05-26.
Download link:
https://zenodo.org/doi/10.5281/zenodo.18892817
Resource description:
Collection of Time-Series Classification Datasets with Pretrained DL Models and Local Post-hoc Explainers (SHAP, LIME, Anchor & PHAR)
Precomputed bundle of post-hoc explanations and black-box models for time-series classification.
ExplainTS is a testbed of 83 univariate and 20 multivariate time-series datasets from the UCR/UEA repository (source: TSC), each paired with a deep learning model trained for multiclass classification. For each dataset, we provide:
- Precomputed train/test splits (75/25).
- A trained ConvLSTM-based TensorFlow model (SavedModel and .h5 formats).
- Post-hoc local explanations generated using four methods:
  - Shapley Additive Explanations (SHAP),
  - Local Interpretable Model-agnostic Explanations (LIME),
  - Anchor,
  - Post-hoc Attribution Rules [1] (PHAR).
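As a rough illustration of how one of the pretrained models might be loaded, the sketch below looks for an .h5 file inside an extracted model directory and passes it to `tf.keras.models.load_model`. The helper names `find_h5_file` and `load_pretrained_model` are hypothetical, and the name of the .h5 file inside each directory is not documented here, so the code simply searches for it. On TensorFlow releases that bundle Keras 3, `load_model` may reject the SavedModel directory format, which is why this sketch targets the .h5 file; whether a given TensorFlow version reads these particular files is an assumption to verify locally.

```python
from pathlib import Path


def find_h5_file(model_dir):
    """Return the first .h5 file found under model_dir, or None if there is none."""
    matches = sorted(Path(model_dir).rglob("*.h5"))
    return matches[0] if matches else None


def load_pretrained_model(model_dir):
    """Load a pretrained classifier from an extracted model directory.

    Assumption: the directory holds an .h5 file that Keras can read directly.
    """
    import tensorflow as tf  # imported lazily so find_h5_file works without TensorFlow

    h5_path = find_h5_file(model_dir)
    if h5_path is None:
        raise FileNotFoundError(f"no .h5 file found under {model_dir}")
    # compile=False skips restoring the optimizer, which inference does not need.
    return tf.keras.models.load_model(str(h5_path), compile=False)
```

A call such as `load_pretrained_model("models/uni_ECG5000_model")` would follow the `{uni|multi}_{series_name}_model` naming used in models.zip; the exact directory name is illustrative.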
Impact of the dataset
These datasets provide a ready-to-use, frozen benchmark layer for Explainable AI in time-series classification. Since models and explanation outputs are precomputed, researchers can immediately use them for evaluation, visualization, or developing new post-hoc XAI metrics without the massive computational overhead of retraining or re-explaining models.
Comprehensive Coverage: 83 univariate and 20 multivariate UCR/UEA time-series datasets, each with a standardized 75/25 train-test split stored as pickled NumPy arrays (.pickle files).
Pretrained Models: Ready-to-use ConvLSTM1D models for every dataset (requiring no custom dependencies), eliminating costly training and ensuring experimental consistency.
Precomputed Explanations: Post-hoc outputs for training and test sets spanning attribution scores (DeepSHAP, LIME) and discrete rule sets (Anchor, PHAR) with confidence and coverage metadata.
Living Resource: ExplainTS is designed as a community-driven resource; we actively invite researchers to contribute their locally computed explanation artifacts to future releases.
Educational & Prototyping Value: Includes a ready-to-use Jupyter notebook demonstrating how to calculate XAI stability metrics and render publication-quality explanation plots directly over time-series signals.
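The key names used inside the discrete rule records are not listed on this page, so it can help to inspect their structure before relying on any particular field. The sketch below assumes only that the records form a nested list of dictionaries, the format documented for the Anchor files, with one inner list per sample (an assumption). The function name `summarize_rule_records` is hypothetical.

```python
from collections import Counter


def summarize_rule_records(records):
    """Summarize a nested List[List[dict]] without assuming any key names."""
    per_sample = [len(sample_rules) for sample_rules in records]
    key_counts = Counter(
        key
        for sample_rules in records
        for rule in sample_rules
        if isinstance(rule, dict)
        for key in rule
    )
    return {
        "n_samples": len(records),
        "n_records": sum(per_sample),
        "max_records_per_sample": max(per_sample, default=0),
        "key_counts": dict(key_counts),
    }
```

The `key_counts` entry shows which fields actually occur and how often, which is a safer starting point than guessing field names.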
The following visualizations, generated using the included educational notebook, demonstrate a practical XAI auditing use case on the ECG5000 dataset. We compare the explanations of a reference sample against its nearest neighbor to evaluate method stability.
- View SHAP attributions. Link: https://raw.githubusercontent.com/mozo64/papers/main/zenodo-ucr/results/shap_stability_casestudy_ECG5000.png
- View Discrete PHAR interval rules applied to the same signals. Link: https://raw.githubusercontent.com/mozo64/papers/main/zenodo-ucr/results/phar_casestudy_comparison_ECG5000.png
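The comparison described above, a reference sample against its nearest neighbor, can be sketched in a few lines of NumPy. This is not the metric implemented in the included notebook, which is not specified here; it is one plausible variant that uses Euclidean distance to find the neighbor and cosine similarity to compare attribution maps. It assumes the first axis of both arrays indexes samples, so that `attributions[i]` explains `X[i]`. All function names are hypothetical.

```python
import numpy as np


def nearest_neighbor_index(X, ref_idx):
    """Index of the sample closest to X[ref_idx] in Euclidean distance, excluding itself."""
    flat = np.asarray(X, dtype=float).reshape(len(X), -1)
    dists = np.linalg.norm(flat - flat[ref_idx], axis=1)
    dists[ref_idx] = np.inf  # never pick the reference sample itself
    return int(np.argmin(dists))


def attribution_similarity(a, b):
    """Cosine similarity between two attribution maps; 0.0 if either map is all zeros."""
    a = np.asarray(a, dtype=float).ravel()
    b = np.asarray(b, dtype=float).ravel()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


def local_stability(X, attributions, ref_idx):
    """Return (neighbor index, similarity of the two samples' attribution maps)."""
    nn_idx = nearest_neighbor_index(X, ref_idx)
    return nn_idx, attribution_similarity(attributions[ref_idx], attributions[nn_idx])
```

A similarity close to 1 suggests the explainer assigns importance to similar inputs in a similar way. If an attribution array carries an extra per-class axis, it would need to be sliced to one class before being passed in.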
Companion Code & Notebooks
All resources are openly available under the CC-BY-4.0 license. The linked GitHub repository (https://github.com/mozo64/papers/tree/main/zenodo-ucr) provides a suite of Python scripts and Jupyter notebooks for reproducing and interacting with the benchmark:
ExplainTS_CaseStudy.ipynb — An educational case study for calculating XAI stability metrics and plotting explanations over time-series.
UCR-train.ipynb — Pipeline for training the baseline ConvLSTM models.
UCR-explainers-lime-shap.ipynb — Execution of DeepSHAP and LIME explainers.
UCR-explainers-anchor.ipynb — Execution of the Anchor explainer with cascade retries.
UCR-explainers-phar.ipynb — Execution of the PHAR rule extraction and hyperparameter optimization.
datasets_summary.ipynb — Utility for extracting dataset statistics and model accuracies.
Repository content
train_test.zip — contains files of the form {uni|multi}_{series_name}_train_and_test.zip. Each includes: trainX.pickle, trainy.pickle, testX.pickle, testy.pickle. Format: `numpy.array`
models.zip — trained models as directories in the form {uni|multi}_{series_name}_model. Each contains a TensorFlow SavedModel and .h5 file for loading flexibility.
shap.zip — DeepSHAP values in {series_name}_shap_values.zip. Files: svtr.pickle (train), svts.pickle (test). Format: `numpy.array`
lime.zip — LIME values in {series_name}_lime_values.zip. Files: lvtr.pickle (train), lvts.pickle (test). Format: `numpy.array`
anchor.zip — Anchor rule records in {series_name}_anchor_values.zip. Files: avtr.pickle (train), avts.pickle (test). Format: `List[List[Dictionary]]`
phar.zip — PHAR rules and hyperoptimization logs in {series_name}_phar_values.zip. Files: pvtr.pickle and pvts.pickle, plus phar_metadata.json and phar_trials_log.jsonl.
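Because each top-level archive listed above contains one inner zip per dataset, the files can be read without extracting anything to disk. The sketch below is a minimal reader under stated assumptions: the inner archive is located by matching the end of its name (the folder layout inside the outer zip is not documented here), .pickle members are unpickled, .json members are parsed as JSON, and .jsonl members are parsed as one JSON object per line. The function name `load_inner_archive` is hypothetical.

```python
import io
import json
import pickle
import zipfile


def load_inner_archive(outer_zip_path, inner_zip_name):
    """Parse every file of one inner zip stored inside an outer archive.

    Returns a dict mapping member file names to parsed objects.
    Only unpickle archives you trust: pickle.loads can execute arbitrary code.
    """
    with zipfile.ZipFile(outer_zip_path) as outer:
        # Match by suffix because the folder layout inside the outer zip is an assumption.
        candidates = [n for n in outer.namelist() if n.endswith(inner_zip_name)]
        if not candidates:
            raise FileNotFoundError(f"{inner_zip_name} not found in {outer_zip_path}")
        inner_bytes = outer.read(candidates[0])

    parsed = {}
    with zipfile.ZipFile(io.BytesIO(inner_bytes)) as inner:
        for member in inner.namelist():
            if member.endswith("/"):
                continue  # skip directory entries
            raw = inner.read(member)
            base = member.rsplit("/", 1)[-1]
            if base.endswith(".pickle"):
                parsed[base] = pickle.loads(raw)
            elif base.endswith(".jsonl"):
                lines = raw.decode("utf-8").splitlines()
                parsed[base] = [json.loads(line) for line in lines if line.strip()]
            elif base.endswith(".json"):
                parsed[base] = json.loads(raw.decode("utf-8"))
    return parsed
```

For example, `load_inner_archive("train_test.zip", "uni_ECG5000_train_and_test.zip")["trainX.pickle"]` would return the training inputs, assuming the series name appears in the file name exactly as written. NumPy must be installed for the array pickles to load.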
[1] Mozolewski, M., Bobek, S., & Nalepa, G. J. (2026). Explaining Time Series Classifiers with PHAR: Rule Extraction and Fusion from Post-hoc Attributions. arXiv preprint arXiv:2508.01687. https://arxiv.org/abs/2508.01687
Provider:
Zenodo
Created:
2026-03-08



