Software and data underlying the publication: "Natural Language Counterfactual Explanations in Financial Text Classification: A Comparison of Generators and Evaluation Metrics"

Name: Software and data underlying the publication: "Natural Language Counterfactual Explanations in Financial Text Classification: A Comparison of Generators and Evaluation Metrics"
Creator: Dobiczek, Karol
Published: 2025-11-18 00:00:00
License: No description available

4TU.ResearchData. Updated: 2025-11-18. Indexed: 2026-04-23.

Download link:

https://data.4tu.nl/datasets/7270e8b5-134a-4939-b614-158a7d225622/1

Description:

This dataset contains the data collected through experiments and surveys, together with the analyzed results, for the ACL GEM² 2025 workshop submission titled "Natural Language Counterfactual Explanations in Financial Text Classification: A Comparison of Generators and Evaluation Metrics". The project used texts from expert domains to evaluate state-of-the-art methods for generating textual counterfactual explanations for large language model text classification. The data contains pre-processed texts from the financial dataset "Trillion Dollar Words", the counterfactuals generated in the experiments, as well as raw and pre-processed results of the metric-based and human annotation-based experiments. Additionally, we include the software used to generate our results.

Provided by:

Dobiczek, Karol

Created:

2025-11-18
