BARDA
Source: arXiv. Updated: 2024-03-24. Indexed: 2024-06-21.
Download link:
https://allenai.org/data/barda
Description:
BARDA is a belief and reasoning dataset created by the Allen Institute for AI. It is designed to cleanly separate factual accuracy from reasoning ability. The dataset contains 3000 inferences built from 9000 statements, of which 6681 are true and 2319 are false, and is used to evaluate language models on both aspects. To avoid belief bias, it mixes true and false facts and, in particular, includes counterfactual examples. Beyond its use in the authors' own research, BARDA serves as a new evaluation benchmark for other models, specifically for measuring factual accuracy and reasoning ability independently of each other.
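As a rough illustration of what scoring the two aspects separately could look like, the sketch below computes one accuracy over statement-level truth judgments and another over inference-level validity judgments. The record layout (`StatementJudgment`, `InferenceJudgment`) and the `score` function are hypothetical and are not taken from the BARDA release; the toy data is invented purely for the example.

```python
from dataclasses import dataclass


@dataclass
class StatementJudgment:
    # Hypothetical record: gold truth label of a statement vs. the model's belief.
    gold_is_true: bool
    model_says_true: bool


@dataclass
class InferenceJudgment:
    # Hypothetical record: whether the inference is valid vs. the model's verdict.
    gold_is_valid: bool
    model_says_valid: bool


def accuracy(pairs):
    """Fraction of (gold, predicted) pairs that agree."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("no judgments to score")
    return sum(gold == pred for gold, pred in pairs) / len(pairs)


def score(statements, inferences):
    """Report factual and reasoning accuracy as two separate numbers."""
    return {
        "factual_accuracy": accuracy(
            (s.gold_is_true, s.model_says_true) for s in statements
        ),
        "reasoning_accuracy": accuracy(
            (i.gold_is_valid, i.model_says_valid) for i in inferences
        ),
    }


# Toy data (invented): the model misjudges one false statement and one invalid inference.
statements = [
    StatementJudgment(True, True),
    StatementJudgment(False, True),
    StatementJudgment(False, False),
    StatementJudgment(True, True),
]
inferences = [
    InferenceJudgment(True, True),
    InferenceJudgment(False, True),
]

result = score(statements, inferences)
print(result)
```

Keeping the two scores apart means a model that holds wrong beliefs but reasons validly from them, or the reverse, is not hidden behind a single aggregate number.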
Provided by:
Allen Institute for AI
Created:
2023-12-13



