jhu-clsp/SARA
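Dataset Overview
Dataset name: SARA
Version: v1
Language: English
Tags: legal, tax, natural language inference, question answering
Size: fewer than 1,000 records
Dataset Description
Purpose: statutory reasoning, entailment, and question answering in tax law
Contact: nils.holzenberger@telecom-paris.fr
Dataset Summary
Citation information: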
@inproceedings{Holzenberger2020ADF,
  title={A Dataset for Statutory Reasoning in Tax Law Entailment and Question Answering},
  author={Nils Holzenberger and Andrew Blair-Stanek and Benjamin Van Durme},
  booktitle={NLLP@KDD},
  year={2020}
}
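Supported Tasks and Leaderboards
Tasks:
- Question answering
- Natural language inference
Dataset splits: a training set and a test set are provided. There is no official leaderboard.
Dataset Structure
Data Instances
Example: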
{
  "id": "s151_a_neg",
  "text": "Alice's income in 2015 is $100000. She gets one exemption of $2000 for the year 2015 under section 151(c). Alice is not married.",
  "question": "Alice's total exemption for 2015 under section 151(a) is equal to $6000",
  "answer": "Contradiction",
  "facts": ":- discontiguous s151_c/4. :- [statutes/prolog/init]. income_(alice_makes_money). agent_(alice_makes_money,alice). start_(alice_makes_money,\"2015-01-01\"). end_(alice_makes_money,\"2015-12-31\"). amount_(alice_makes_money,100000). s151_c(alice,_,2000,2015).",
  "test": ":- \\+ s151_a(alice,6000,2015)."
}
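Data Fields
- id: unique identifier that indicates the case number and the relevant statute.
- text: background details of the legal case.
- question: the question or hypothesis.
- answer: the answer to the question, or the NLI judgment (entailment/contradiction).
- facts: the facts relevant to the case, expressed in Prolog.
- test: the associated executable code, expressed in Prolog.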
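The field list above can be mirrored in a small typed container, which makes it easier to see how a record might be handled downstream. This is only a sketch: `SaraRecord`, `split_case_id`, and `is_entailment` are hypothetical helpers, not part of the dataset. The id convention (statute part followed by a trailing tag) is inferred from the single example `s151_a_neg`, and the label spelling `Entailment` is assumed by analogy with the `Contradiction` label shown in the example.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class SaraRecord:
    """One SARA instance, mirroring the six documented fields."""
    id: str
    text: str
    question: str
    answer: str
    facts: str
    test: str


def split_case_id(case_id: str) -> Tuple[str, str]:
    """Split an id such as 's151_a_neg' into (statute part, trailing tag).

    Assumption: the last underscore-separated token is a case tag and the
    remainder names the statute section. Inferred from one example only.
    """
    statute, _, tag = case_id.rpartition("_")
    return statute, tag


def is_entailment(answer: str) -> bool:
    """Map an NLI label to a boolean; other answers raise ValueError."""
    label = answer.strip().lower()
    if label not in {"entailment", "contradiction"}:
        raise ValueError(f"not an NLI label: {answer!r}")
    return label == "entailment"


# Values taken from the example instance; the Prolog fields are abbreviated.
record = SaraRecord(
    id="s151_a_neg",
    text="Alice's income in 2015 is $100000.",
    question="Alice's total exemption for 2015 under section 151(a) is equal to $6000",
    answer="Contradiction",
    facts="s151_c(alice,_,2000,2015).",
    test="(omitted)",
)

print(split_case_id(record.id), is_entailment(record.answer))
```

Raising on unknown labels is a deliberate choice here: records from the `qa` configuration may carry answers that are not NLI labels, and silently treating them as contradictions would hide that.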
Data Splits
The dataset splits can be accessed with the following Python code:

    from datasets import load_dataset

    qa_test = load_dataset("jhu-clsp/SARA", "qa", split="test")
    qa_train = load_dataset("jhu-clsp/SARA", "qa", split="train")
    nli_test = load_dataset("jhu-clsp/SARA", "nli", split="test")
    nli_train = load_dataset("jhu-clsp/SARA", "nli", split="train")
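Once a split is loaded, a quick sanity check is to tally the values of the `answer` field. The sketch below assumes each record behaves like a mapping with an `answer` key, as in the example instance. `answer_counts` and `load_nli_train` are hypothetical helper names, and the loader is kept in its own function so the counting logic can be tried on in-memory records without network access.

```python
from collections import Counter
from typing import Dict, Iterable, Mapping


def answer_counts(records: Iterable[Mapping[str, str]]) -> Dict[str, int]:
    """Count how often each value of the 'answer' field occurs."""
    return dict(Counter(r["answer"] for r in records))


def load_nli_train():
    """Load the NLI training split (downloads from the Hugging Face Hub)."""
    from datasets import load_dataset  # imported lazily; needs `datasets`
    return load_dataset("jhu-clsp/SARA", "nli", split="train")


# Hand-written stand-in records, not real dataset rows.
sample = [
    {"answer": "Contradiction"},
    {"answer": "Entailment"},
    {"answer": "Contradiction"},
]
print(answer_counts(sample))
```

With the `datasets` library installed and network access available, `answer_counts(load_nli_train())` should give the label distribution of the NLI training split.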



