JuDDGES/en-appealcourt

Name: JuDDGES/en-appealcourt
Creator: JuDDGES
Published: 2025-05-15 16:11:37
License: no description available

Source: Hugging Face. Updated: 2025-05-15. Indexed: 2025-07-05.

Download link:

https://hf-mirror.com/datasets/JuDDGES/en-appealcourt

Description:

A dataset for training and evaluating large language models (LLMs) on information extraction from publicly available judgments of the England and Wales Court of Appeal (Criminal Division). It includes 6,154 documents drawn from the JuDDGES/en-court-raw dataset. The dataset is divided into three parts, two of which are a test set and an annotated set of 573 samples each. The test set was generated automatically with GPT-4.1-2025-04-14, while the annotated set was labeled by hand. Each document has two columns: context (the full text of the judgment) and output (the extracted information). The output column is a JSON object with fields such as defendant, victim, offense, and sentence. The dataset was created following strict annotation guidelines intended to improve the accuracy and consistency of the annotations.
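The two-column layout described above can be illustrated with a minimal Python sketch. The record below is a hypothetical placeholder, not real data from the dataset, and the key names inside the JSON object (defendant, offense, sentence) are assumptions based on the field list in the description; the actual schema may differ.

```python
import json

# Hypothetical record shaped like the description: "context" holds the
# judgment text, "output" holds the extracted fields as a JSON object.
# Key names and values are placeholders, not taken from the dataset.
record = {
    "context": "Full text of a judgment would appear here.",
    "output": json.dumps(
        {"defendant": "placeholder", "offense": "placeholder", "sentence": None}
    ),
}


def parse_output(row):
    """Return the extracted fields as a dict, whether the column holds
    JSON text or an already-decoded mapping."""
    raw = row["output"]
    return json.loads(raw) if isinstance(raw, str) else dict(raw)


def filled_fields(row):
    """Return the sorted names of fields whose extracted value is not empty."""
    empty_values = (None, "", [], {})
    return sorted(
        name for name, value in parse_output(row).items()
        if value not in empty_values
    )


fields = parse_output(record)
print(filled_fields(record))
```

Rows loaded with the Hugging Face datasets library (for example via load_dataset("JuDDGES/en-appealcourt")) could presumably be passed to parse_output in the same way, although the split names and the exact storage format of the output column are not stated in this listing.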

Provided by:

JuDDGES
