Interpretability Evaluation Benchmark for Pre-trained Language Models

Name: Interpretability Evaluation Benchmark for Pre-trained Language Models (预训练语言模型可解释性评估基准)
Creator: Baidu Inc.
Published: 2022-07-28T16:28:09+08:00

Source: arXiv. Updated 2022-07-28; indexed 2024-06-21.

Pre-trained language models

Interpretability evaluation

Data link:

http://xyz

Resource description:

The Interpretability Evaluation Benchmark for Pre-trained Language Models is a dataset developed by Baidu Inc. for evaluating pre-trained language models along five dimensions: syntax, semantics, knowledge, reasoning, and computation. It contains annotated data in both Chinese and English. Each carefully designed instance carries annotated token-level rationales, so the benchmark can test both a model's predictive performance and its interpretability across tasks. The dataset was built in three stages: data collection, construction of perturbed data, and iterative rationale annotation and checking, which together are intended to ensure data quality and diversity. The benchmark targets performance evaluation and interpretability assessment of pre-trained language models on specific tasks, with the aim of advancing research on such models.
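As an illustration of how the token-level annotations described above might be used, the sketch below scores the overlap between the tokens a model marks as important and the annotated tokens, using token-level precision, recall, and F1. This is an assumption for illustration only: the description does not state which metrics the benchmark uses, and the function name `rationale_prf`, the example sentence, and the index sets are all hypothetical rather than taken from the dataset.

```python
from typing import Set, Tuple


def rationale_prf(predicted: Set[int], gold: Set[int]) -> Tuple[float, float, float]:
    """Token-level precision, recall, and F1 between two sets of token indices.

    `predicted` holds the indices a model marks as important; `gold` holds the
    annotated indices. Both are hypothetical inputs, not the dataset's schema.
    """
    if not predicted or not gold:
        return 0.0, 0.0, 0.0
    overlap = len(predicted & gold)
    precision = overlap / len(predicted)
    recall = overlap / len(gold)
    f1 = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical syntax-style instance: subject-verb agreement.
tokens = ["The", "keys", "to", "the", "cabinet", "are", "missing"]
gold = {1, 5}          # assumed annotation: "keys", "are"
predicted = {1, 4, 5}  # assumed model output: "keys", "cabinet", "are"

p, r, f1 = rationale_prf(predicted, gold)
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

Here the model recovers both annotated tokens but also flags one extra token, so recall is perfect while precision, and therefore F1, is lower.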

Provider:

Baidu Inc.

Created:

2022-07-28
