datht/vlegal

Name: datht/vlegal
Creator: datht
Published: 2026-04-15 04:16:21
License: not specified on the listing (the dataset card metadata declares apache-2.0)

Hugging Face · Updated 2026-04-15 · Indexed 2026-04-26

Download link:

https://hf-mirror.com/datasets/datht/vlegal


Description:

---
dataset_info:
  features:
  - name: instruction
    dtype: string
  - name: question
    dtype: string
  - name: answers
    dtype: string
  - name: ground_truth
    dtype: string
  - name: _task
    dtype: string
configs:
- config_name: default
  data_files:
  - split: test
    path: data/vlegal_bench_full.jsonl
- config_name: task_1_1
  data_files:
  - split: test
    path: data/task_1_1.jsonl
- config_name: task_1_2
  data_files:
  - split: test
    path: data/task_1_2.jsonl
- config_name: task_1_3
  data_files:
  - split: test
    path: data/task_1_3.jsonl
- config_name: task_1_4
  data_files:
  - split: test
    path: data/task_1_4.jsonl
- config_name: task_1_5
  data_files:
  - split: test
    path: data/task_1_5.jsonl
- config_name: task_2_1
  data_files:
  - split: test
    path: data/task_2_1.jsonl
- config_name: task_2_2
  data_files:
  - split: test
    path: data/task_2_2.jsonl
- config_name: task_2_3
  data_files:
  - split: test
    path: data/task_2_3.jsonl
- config_name: task_2_4
  data_files:
  - split: test
    path: data/task_2_4.jsonl
- config_name: task_2_5
  data_files:
  - split: test
    path: data/task_2_5.jsonl
- config_name: task_3_1
  data_files:
  - split: test
    path: data/task_3_1.jsonl
- config_name: task_3_2
  data_files:
  - split: test
    path: data/task_3_2.jsonl
- config_name: task_3_3
  data_files:
  - split: test
    path: data/task_3_3.jsonl
- config_name: task_3_4
  data_files:
  - split: test
    path: data/task_3_4.jsonl
- config_name: task_3_5
  data_files:
  - split: test
    path: data/task_3_5.jsonl
- config_name: task_4_1
  data_files:
  - split: test
    path: data/task_4_1.jsonl
- config_name: task_4_2
  data_files:
  - split: test
    path: data/task_4_2.jsonl
- config_name: task_4_3
  data_files:
  - split: test
    path: data/task_4_3.jsonl
- config_name: task_5_1
  data_files:
  - split: test
    path: data/task_5_1.jsonl
- config_name: task_5_2
  data_files:
  - split: test
    path: data/task_5_2.jsonl
- config_name: task_5_3
  data_files:
  - split: test
    path: data/task_5_3.jsonl
- config_name: task_5_4
  data_files:
  - split: test
    path: data/task_5_4.jsonl
language:
- vi
license: apache-2.0
task_categories:
- text-generation
- question-answering
- text-classification
tags:
- legal
- vietnamese
- benchmark
- vlegal-bench
- evaluation-only
size_categories:
- 10K<n<100K
---

# VLegal: Vietnamese Legal Benchmark (Evaluation Only)

Reformatted version of [VLegal-Bench](https://huggingface.co/datasets/CMC-OPENAI/VLegal-Bench) for per-task evaluation of Vietnamese Legal LLMs.

> **This dataset is for EVALUATION ONLY. Do NOT use it for training.**
> VLegal-Bench is a benchmark test set. Training on this data contaminates benchmark scores.

## Usage

```python
from datasets import load_dataset

# Load full benchmark (all 22 tasks)
benchmark = load_dataset("datht/vlegal", split="test")

# Load specific task
task_1_1 = load_dataset("datht/vlegal", "task_1_1", split="test")
task_4_2 = load_dataset("datht/vlegal", "task_4_2", split="test")
```

## Tasks (22 total, 10,467 samples)

### Category 1: Recognition & Recall (3,520 samples)

| Task | Name | Samples | Type |
|------|------|---------|------|
| 1.1 | Legal Entity Recognition | 748 | MC |
| 1.2 | Legal Topic Classification | 683 | MC |
| 1.3 | Legal Concept Recall | 300 | MC |
| 1.4 | Article Recall | 968 | MC |
| 1.5 | Legal Schema Recall | 821 | MC |

### Category 2: Understanding & Structuring (2,837 samples)

| Task | Name | Samples | Type |
|------|------|---------|------|
| 2.1 | Relation Extraction | 253 | MC |
| 2.2 | Legal Element Recognition | 300 | MC |
| 2.3 | Legal Graph Structuring | 326 | MC |
| 2.4 | Judgement Verification | 599 | MC |
| 2.5 | User Intent Understanding | 1,359 | MC |

### Category 3: Reasoning & Inference (2,017 samples)

| Task | Name | Samples | Type |
|------|------|---------|------|
| 3.1 | Article/Clause Prediction | 600 | MC |
| 3.2 | Legal Court Decision Prediction | 600 | MC |
| 3.3 | Multi-hop Graph Reasoning | 292 | MC |
| 3.4 | Conflict & Consistency Detection | 166 | MC |
| 3.5 | Penalty/Remedy Estimation | 359 | MC |

### Category 4: Interpretation & Generation (1,194 samples)

| Task | Name | Samples | Type |
|------|------|---------|------|
| 4.1 | Legal Document Summarization | 396 | Gen |
| 4.2 | Judicial Reasoning Generation | 300 | Gen |
| 4.3 | Legal Opinion Generation | 498 | Gen |

### Category 5: Ethics, Fairness & Bias (899 samples)

| Task | Name | Samples | Type |
|------|------|---------|------|
| 5.1 | Bias Detection | 249 | MC |
| 5.2 | Privacy & Data Protection | 217 | MC |
| 5.3 | Ethical Consistency Assessment | 199 | MC |
| 5.4 | Unfair Contract Detection | 234 | MC |

## Evaluation Metrics

- **Multiple Choice (MC)**: Accuracy
- **Generation (Gen)**: ROUGE-L, BERTScore

## Source

[CMC-OPENAI/VLegal-Bench](https://huggingface.co/datasets/CMC-OPENAI/VLegal-Bench) (arXiv:2512.14554)
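
The two metric families listed under Evaluation Metrics can be illustrated with a minimal, dependency-free sketch. It is not the benchmark authors' scoring code. The field names `ground_truth` and `_task` come from the dataset card, but the helper names (`normalize`, `accuracy_by_task`, `lcs_length`, `rouge_l_f1`), the case- and whitespace-insensitive exact match, the whitespace tokenization, the `_task` values, and the toy records are all assumptions made for illustration. BERTScore is left out because it requires a pretrained model.

```python
from collections import defaultdict


def normalize(text):
    """Lowercase and collapse whitespace before comparing (an assumption)."""
    return " ".join(str(text).lower().split())


def accuracy_by_task(records, predictions):
    """Exact-match accuracy per `_task` value.

    `records` are dicts carrying the card's `ground_truth` and `_task` fields;
    `predictions` is a parallel list of model outputs (hypothetical input).
    """
    if len(records) != len(predictions):
        raise ValueError("records and predictions must have the same length")
    correct = defaultdict(int)
    total = defaultdict(int)
    for record, prediction in zip(records, predictions):
        task = record["_task"]
        total[task] += 1
        if normalize(prediction) == normalize(record["ground_truth"]):
            correct[task] += 1
    return {task: correct[task] / total[task] for task in total}


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    previous = [0] * (len(b) + 1)
    for token_a in a:
        current = [0]
        for j, token_b in enumerate(b, start=1):
            if token_a == token_b:
                current.append(previous[j - 1] + 1)
            else:
                current.append(max(previous[j], current[j - 1]))
        previous = current
    return previous[-1]


def rouge_l_f1(prediction, reference):
    """ROUGE-L F1 over whitespace tokens (tokenization is an assumption)."""
    pred_tokens = normalize(prediction).split()
    ref_tokens = normalize(reference).split()
    if not pred_tokens or not ref_tokens:
        return 0.0
    lcs = lcs_length(pred_tokens, ref_tokens)
    if lcs == 0:
        return 0.0
    precision = lcs / len(pred_tokens)
    recall = lcs / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


# Toy records that mimic the card's schema; the values are made up.
toy_records = [
    {"_task": "task_1_1", "ground_truth": "A"},
    {"_task": "task_1_1", "ground_truth": "B"},
    {"_task": "task_5_4", "ground_truth": "C"},
]
toy_predictions = ["a", "C", " C "]
scores = accuracy_by_task(toy_records, toy_predictions)
```

For real runs, the records would come from `load_dataset("datht/vlegal", split="test")`, and the comparison rule should be adapted to whatever format `ground_truth` actually uses for each task. For reported ROUGE-L and BERTScore numbers, an established implementation is preferable to this sketch.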

