ContractNLI (ContractNLI: A Dataset for Document-level Natural Language Inference for Contracts)
OpenDataLab · Updated 2026-05-31 · Indexed 2024-05-09
Download link:
https://opendatalab.org.cn/OpenDataLab/ContractNLI
Resource description:
ContractNLI is a dataset for document-level natural language inference (NLI) on contracts. Its goal is to automate or support the time-consuming process of contract review. In this task, a system is given a set of hypotheses (e.g., "Certain obligations under the agreement may survive termination.") and a contract. For each hypothesis, it must decide whether the contract entails it, contradicts it, or does not mention it (neutral), and it must identify the evidence in the contract that supports that decision.
ContractNLI is the first dataset to apply NLI to contracts and, as of September 2021, the largest corpus of annotated contracts. It is a challenging problem from a machine learning perspective (the label distribution is imbalanced, the task is inherently multi-task, and training data is scarce) and from a linguistic perspective (contracts have distinctive linguistic features, in particular unusual negation patterns, that make the task harder).
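The two outputs described above (a three-way label plus evidence located in the contract) can be pictured with a small Python sketch. Everything in it is an assumption made for illustration: the names `Label`, `Prediction`, and `toy_classify` are hypothetical, the evidence is represented as character offsets, and the keyword lookup is only a stand-in for a real NLI model. It does not reflect the dataset's actual file schema.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Tuple


class Label(Enum):
    # Hypothetical names for the three classes in the task description.
    ENTAILMENT = "entailment"
    CONTRADICTION = "contradiction"
    NOT_MENTIONED = "not_mentioned"


@dataclass
class Prediction:
    hypothesis: str
    label: Label
    # Evidence as (start, end) character offsets into the contract text.
    # This representation is an assumption, not the dataset's format.
    evidence: List[Tuple[int, int]]


def toy_classify(contract: str, hypothesis: str, cue: str) -> Prediction:
    """Placeholder for a real model: label by literal lookup of a cue phrase."""
    start = contract.find(cue)
    if start == -1:
        # Nothing in the contract addresses the hypothesis, so no evidence.
        return Prediction(hypothesis, Label.NOT_MENTIONED, [])
    return Prediction(hypothesis, Label.ENTAILMENT, [(start, start + len(cue))])


contract = (
    "The obligations of confidentiality shall survive termination "
    "of this Agreement."
)
hypothesis = "Certain obligations under the agreement may survive termination."
pred = toy_classify(contract, hypothesis, cue="survive termination")

for start, end in pred.evidence:
    print(pred.label.value, repr(contract[start:end]))
```

The sketch keeps the label and the evidence in one result object because the task asks for both at once; a hypothesis that is not mentioned carries an empty evidence list.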
Provider:
OpenDataLab
Created:
2022-05-23
Collected summary
Dataset introduction

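Background and challenges
Background overview
ContractNLI is a dataset for document-level natural language inference on contracts, intended to automate contract review. It covers two tasks: classifying hypotheses and identifying the supporting evidence. It is the first NLI dataset for contracts and one of the largest annotated contract corpora, and it is characterized by label imbalance and a multi-task formulation.
The content above was collected and summarized by 遇见数据集.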



