LegalCore
Source: arXiv (indexed 2025-09-30)
Download link:
https://github.com/weikangda/legalcore/tree/main/data
Description:
LegalCore is the first legal-domain dataset annotated with comprehensive event and event coreference information. It comprises 100 legal contract documents, averaging 25,000 tokens per document, with a total of 23,183 event mentions. The inter-annotator agreement for event mentions reaches 80.2%, and the dataset reports statistics for both local and non-local coreference links. Its core tasks are event detection and event coreference resolution, and it poses substantial challenges to state-of-the-art open-source and proprietary large language models (LLMs).
Provided by:
CUAD dataset



