AttrScore
ModelScope community. Updated 2025-07-04; indexed 2025-07-05.
Download link:
https://modelscope.cn/datasets/osunlp/AttrScore
Description:
# Dataset Card for AttrScore
- Repository: https://github.com/OSU-NLP-Group/AttrScore
- Paper: [Automatic Evaluation of Attribution by Large Language Models](https://arxiv.org/pdf/2305.06311.pdf)
- Point of Contact: [Xiang Yue](mailto:yue.149@osu.edu)
### Citation Information
```bib
@article{yue2023automatic,
title={Automatic Evaluation of Attribution by Large Language Models},
author={Yue, Xiang and Wang, Boshi and Zhang, Kai and Chen, Ziru and Su, Yu and Sun, Huan},
journal={arXiv preprint arXiv:2305.06311},
year={2023}
}
```
### What's New?
In the current version 0.2, we fixed some incorrectly annotated labels in the AttrEval-GenSearch dataset. (Commit: [4da294f](https://huggingface.co/datasets/osunlp/AttrScore/commit/4da294f5e488086492e117b405fc8ea95717ec3b))
### Dataset Summary
A recent focus of large language model (LLM) development, as exemplified by generative search engines, is to incorporate external references that the model uses to generate and support its claims. However, evaluating attribution, i.e., verifying whether a generated statement is indeed fully supported by the cited reference, remains an open problem.
We construct this dataset, which contains both training and test data for the evaluation of attribution. The training data are repurposed from related tasks, such as question answering, fact-checking, natural language inference, and summarization. The test data contain a set simulated from QA datasets and a set manually curated from a generative search engine, New Bing.
## Dataset Structure
### Data Instances
{
"query": "",
"answer": "Bastedo cared for all the animals that inhabit the earth.",
"reference": "Alexandra Lendon Bastedo (9 March 1946 - 12 January 2014) was a British actress, best known for her role as secret agent Sharron Macready in the 1968 British espionage/science fiction adventure series \"The Champions\". She has been cited as a sex symbol of the 1960s and 1970s. Bastedo was a vegetarian and animal welfare advocate.",
"label": "Extrapolatory",
"dataset": "anli"
}
{
"query": "The persian gulf war began when iraq invaded what country?",
"answer": "The Persian Gulf War began when Iraq invaded Kuwait.",
"reference": "First Iraq War or Iraq War, before the term \"Iraq War\" became identified instead with the 2003 Iraq War. The Iraqi Army's occupation of Kuwait that began 2 August 1990 was met with international condemnation and brought immediate economic sanctions against Iraq by members of the UN Security Council. Together with the UK's prime minister Margaret Thatcher - who had resisted the invasion by Argentina of the Falkland Islands a decade earlier - George H. W. Bush deployed US forces into Saudi Arabia, and urged other countries to send their own forces to the scene. An array of nations joined the coalition, forming the",
"label": "Attributable",
"dataset": "NaturalQuestions"
}
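As a minimal sketch of how one such instance can be read, the snippet below parses the first example above with Python's standard `json` module. The variable names are illustrative only, and treating an empty `query` as "no query supplied" is an assumption based on the field being allowed to be empty.

```python
import json

# The first example instance above, as a JSON string.
# A raw string keeps the \" escapes inside the reference intact.
RAW_INSTANCE = r'''
{
  "query": "",
  "answer": "Bastedo cared for all the animals that inhabit the earth.",
  "reference": "Alexandra Lendon Bastedo (9 March 1946 - 12 January 2014) was a British actress, best known for her role as secret agent Sharron Macready in the 1968 British espionage/science fiction adventure series \"The Champions\". She has been cited as a sex symbol of the 1960s and 1970s. Bastedo was a vegetarian and animal welfare advocate.",
  "label": "Extrapolatory",
  "dataset": "anli"
}
'''

instance = json.loads(RAW_INSTANCE)

# Assumption: an empty query string means the instance has no query.
has_query = bool(instance["query"])

print(instance["label"], instance["dataset"], has_query)
```

Here the reference says Bastedo was an animal welfare advocate, which neither confirms nor contradicts the stronger claim in `answer`, so the instance is labeled `Extrapolatory`.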
### Data Fields
- "query": query (may be empty)
- "answer": answer to the query
- "reference": a document or a paragraph
- "label": whether the reference supports the answer to the query; one of "Attributable", "Extrapolatory", or "Contradictory"
- "dataset": the original dataset of the data instance
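The fields above can be checked with a small typed record. This is a sketch under stated assumptions, not part of the dataset's own tooling: `AttrRecord`, `parse_record`, and `label_counts` are hypothetical names, and label spelling is normalized to the capitalized form used in the example instances.

```python
from collections import Counter
from dataclasses import dataclass

# The five fields described above.
REQUIRED_FIELDS = {"query", "answer", "reference", "label", "dataset"}

# Label values, capitalized as in the example instances.
VALID_LABELS = {"Attributable", "Extrapolatory", "Contradictory"}


@dataclass(frozen=True)
class AttrRecord:  # hypothetical helper type
    query: str      # may be empty
    answer: str
    reference: str
    label: str
    dataset: str


def parse_record(raw: dict) -> AttrRecord:
    """Build an AttrRecord, rejecting missing fields and unknown labels."""
    missing = REQUIRED_FIELDS - set(raw)
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    # Assumption: labels may differ only in case or surrounding whitespace.
    label = raw["label"].strip().capitalize()
    if label not in VALID_LABELS:
        raise ValueError(f"unknown label: {raw['label']!r}")
    return AttrRecord(
        query=raw["query"],
        answer=raw["answer"],
        reference=raw["reference"],
        label=label,
        dataset=raw["dataset"],
    )


def label_counts(records) -> Counter:
    """Count how many records carry each label."""
    return Counter(record.label for record in records)
```

A tally such as `label_counts(records)` is a quick way to check class balance before training or evaluating an attribution model on a split.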
Provider:
maas
Created:
2025-07-04



