claimify-dataset
Source: ModelScope community (updated 2025-11-12, indexed 2025-08-16)
Download link:
https://modelscope.cn/datasets/RealmSky/claimify-dataset
# Dataset Card
## Dataset Overview
This dataset is associated with the paper [Towards Effective Extraction and Evaluation of Factual Claims](https://arxiv.org/pdf/2502.10855) by Dasha Metropolitansky and Jonathan Larson, accepted to the ACL 2025 Main Conference. See also our [video](https://www.youtube.com/watch?v=WTs-Ipt0k-M) and [blog post](https://www.microsoft.com/en-us/research/blog/claimify-extracting-high-quality-claims-from-language-model-outputs/).
The dataset contains 6,490 sentences, each annotated with a binary label indicating whether it contains a verifiable factual claim. These sentences were extracted from the 396 answers in the [BingCheck dataset](https://github.com/Miaoranmmm/SelfChecker/tree/main/bingcheck) (Li et al., 2024), which contains long-form responses by a commercial search assistant to questions spanning a wide range of topics.
59% of sentences are labeled as containing a verifiable factual claim. This proportion differs slightly from the one reported in the paper (63%) because, as explained in Appendix F of the paper, certain sentences were excluded from the paper's analysis.
## Dataset Structure
The dataset has the following columns:
- `answer_id` *(string)* – unique ID for the answer in BingCheck
- `question` *(string)* – original BingCheck question
- `sentence_id` *(int)* – index of the sentence within the answer
- `sentence` *(string)* – sentence text
- `contains_factual_claim` *(bool)* – True if the sentence contains a verifiable factual claim; otherwise, False
The following is an example row:
```
{
    "answer_id": "c910f021-48e2-44e0-a3fa-3552eaacf5b2",
    "question": "What inspired the invention of the first artificial heart?",
    "sentence_id": 3,
    "sentence": "The first patient to receive the Jarvik-7 was **Barney Clark**, a dentist from Seattle, who survived for 112 days after the implantation[^2^].",
    "contains_factual_claim": True
}
```
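As a rough illustration of how rows with these columns might be used, the sketch below computes the share of sentences labeled as containing a claim and regroups sentences by answer in `sentence_id` order. It assumes the rows are already loaded as a list of Python dicts. The helper names `claim_share` and `sentences_by_answer` are hypothetical, and every row except the first is made up for the example rather than taken from the dataset.

```python
from collections import defaultdict

# The first row is the example from the card; the other two are
# hypothetical placeholders, NOT real dataset rows.
rows = [
    {
        "answer_id": "c910f021-48e2-44e0-a3fa-3552eaacf5b2",
        "question": "What inspired the invention of the first artificial heart?",
        "sentence_id": 3,
        "sentence": "The first patient to receive the Jarvik-7 was **Barney Clark**, "
                    "a dentist from Seattle, who survived for 112 days after the "
                    "implantation[^2^].",
        "contains_factual_claim": True,
    },
    {
        "answer_id": "hypothetical-answer",
        "question": "Hypothetical question?",
        "sentence_id": 1,
        "sentence": "Second sentence.",
        "contains_factual_claim": False,
    },
    {
        "answer_id": "hypothetical-answer",
        "question": "Hypothetical question?",
        "sentence_id": 0,
        "sentence": "First sentence.",
        "contains_factual_claim": True,
    },
]


def claim_share(rows):
    """Return the fraction of rows whose contains_factual_claim is True."""
    if not rows:
        return 0.0
    return sum(1 for r in rows if r["contains_factual_claim"]) / len(rows)


def sentences_by_answer(rows):
    """Map each answer_id to its sentences, ordered by sentence_id."""
    grouped = defaultdict(list)
    for r in rows:
        grouped[r["answer_id"]].append(r)
    return {
        answer_id: [r["sentence"] for r in sorted(rs, key=lambda r: r["sentence_id"])]
        for answer_id, rs in grouped.items()
    }


share = claim_share(rows)
by_answer = sentences_by_answer(rows)
```

Sorting on `sentence_id` does not depend on whether the index starts at 0 or 1, which the card does not specify.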
## Dataset Creation
To divide answers into sentences, we first split on newline characters, then applied NLTK’s sentence tokenizer. Annotation was performed by three employees of Microsoft Research (two of whom were not involved in the project beyond contributing annotations), following the procedure and guidelines detailed in Appendix C of the paper.
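The two-step splitting described above (newlines first, then a sentence tokenizer on each line) can be sketched as follows. This is not the authors' code: the function names are hypothetical, skipping blank lines and stripping whitespace are assumptions, and the default `naive_sentence_tokenizer` is a simple regex stand-in so the block runs without NLTK installed.

```python
import re
from typing import Callable, List


def naive_sentence_tokenizer(text: str) -> List[str]:
    """Stand-in tokenizer (an assumption, NOT NLTK): split after '.', '!'
    or '?' when followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def split_answer(
    answer: str,
    sentence_tokenizer: Callable[[str], List[str]] = naive_sentence_tokenizer,
) -> List[str]:
    """Split an answer on newlines, then tokenize each non-empty line."""
    sentences: List[str] = []
    for line in answer.split("\n"):
        line = line.strip()
        if not line:
            # Assumption: blank lines carry no sentence and are skipped.
            continue
        sentences.extend(sentence_tokenizer(line))
    return sentences


example = "Heading line\nFirst sentence. Second one!\n\nLast line."
result = split_answer(example)
```

To get closer to the procedure in the card, pass `nltk.tokenize.sent_tokenize` as `sentence_tokenizer`; this requires NLTK and its Punkt tokenizer data to be installed.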
## Citation
If you use this dataset, please cite:
```
@inproceedings{metropolitansky-larson-2025-towards,
    title = "Towards Effective Extraction and Evaluation of Factual Claims",
    author = "Metropolitansky, Dasha and
      Larson, Jonathan",
    editor = "Che, Wanxiang and
      Nabende, Joyce and
      Shutova, Ekaterina and
      Pilehvar, Mohammad Taher",
    booktitle = "Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
    month = jul,
    year = "2025",
    address = "Vienna, Austria",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2025.acl-long.348/",
    doi = "10.18653/v1/2025.acl-long.348",
    pages = "6996--7045",
    ISBN = "979-8-89176-251-0",
}
```
## Ethics
All data annotation was conducted with the informed consent of the study participants. No personally identifiable information is included.
Provider: maas
Created: 2025-08-14



