Mirna2Nageh/Egyptian_Criminal_Legal_Assistant_LLM_Fine_Tune_V1
Source: Hugging Face. Updated 2026-02-17; indexed 2026-03-29.
Download link:
https://hf-mirror.com/datasets/Mirna2Nageh/Egyptian_Criminal_Legal_Assistant_LLM_Fine_Tune_V1
Resource description:
---
license: apache-2.0
task_categories:
- text-classification
- question-answering
- summarization
- sentence-similarity
- text-generation
language:
- ar
tags:
- legal
- agent
- code
- NLP
- GENAI
- Arabic
- test
pretty_name: Egyptian_Criminal_Legal_Assistant_LLM_Fine_Tune
size_categories:
- 1B<n<10B
---
# Egyptian Criminal Court Cases Dataset (Arabic)
## Overview
This dataset contains structured Arabic criminal court cases from the Egyptian judicial system.
It is designed to support AI research and development in:
- Criminal case analysis
- Legal Information Extraction
- Legal reasoning
- Weakness detection in criminal cases
- Retrieval-Augmented Generation (RAG)
The dataset focuses exclusively on criminal law cases under Egyptian jurisdiction.
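A minimal sketch of loading the data with the Hugging Face `datasets` library follows. The repository ID comes from this card; the split name `"train"` and the assumption that the default loader works for this repository are not stated in the card and may need adjusting.

```python
# Sketch only: the split name and loader behavior are assumptions.
REPO_ID = "Mirna2Nageh/Egyptian_Criminal_Legal_Assistant_LLM_Fine_Tune_V1"


def load_cases(split: str = "train"):
    """Return one split of the dataset from the Hugging Face Hub.

    Assumption: the repository loads with the default `datasets` loader
    and exposes a split called "train". The card confirms neither.
    """
    # Imported lazily so this module can be imported without `datasets`.
    from datasets import load_dataset

    return load_dataset(REPO_ID, split=split)
```

Calling `load_cases()` needs network access and the `datasets` package. Printing the returned object lists the actual column names, which is a quick way to check them against the field table in this card.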
---
## Legal Scope
This dataset includes:
- Criminal court rulings
- Criminal case facts (الوقائع)
- Legal reasoning
- Referenced criminal law articles
- Final judgments and penalties
It does NOT include civil or commercial cases.
---
## Dataset Structure
Each case record may include:
| Field Name | Description |
|------------|------------|
| `case_id` | Unique case identifier |
| `court` | Criminal court name |
| `year` | Case year |
| `facts` | Case facts (الوقائع) |
| `claims` | Parties’ claims |
| `evidence` | Evidence presented |
| `legal_articles` | Referenced Penal Code articles |
| `judgment` | Final criminal ruling |
| `reasoning` | Judicial reasoning |
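The field table above can be mirrored as a small typed record, which makes missing sections easy to spot. This is a sketch under assumptions: the field names come from the table, but the Python types (for example `year` as an integer and `legal_articles` as a single string) and the helper names `CaseRecord`, `parse_record`, and `missing_fields` are illustrative, not part of the dataset.

```python
from dataclasses import dataclass, fields
from typing import Any, Optional


@dataclass
class CaseRecord:
    """One case. Every field is optional because records "may include" them."""
    case_id: Optional[str] = None
    court: Optional[str] = None
    year: Optional[int] = None            # assumed type
    facts: Optional[str] = None
    claims: Optional[str] = None
    evidence: Optional[str] = None
    legal_articles: Optional[str] = None  # assumed type; may be a list in practice
    judgment: Optional[str] = None
    reasoning: Optional[str] = None


def parse_record(raw: dict[str, Any]) -> CaseRecord:
    """Build a CaseRecord from a raw row, ignoring unknown keys and empty values."""
    known = {f.name for f in fields(CaseRecord)}
    data = {k: v for k, v in raw.items()
            if k in known and v is not None and v != ""}
    # Assumption: `year` may arrive as text; coerce it when it is all decimal digits.
    year = data.get("year")
    if isinstance(year, str):
        data["year"] = int(year) if year.strip().isdecimal() else None
    return CaseRecord(**data)


def missing_fields(record: CaseRecord) -> list[str]:
    """List the documented fields that are absent from this record."""
    return [f.name for f in fields(record) if getattr(record, f.name) is None]
```

Running `missing_fields` over a sample of rows gives a rough picture of how often each section is actually populated.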
---
## Intended Use
This dataset is suitable for:
- Arabic Legal Named Entity Recognition (NER)
- Criminal case structure extraction
- Legal argument mining
- Case similarity retrieval
- Criminal outcome analysis
- RAG-based criminal legal assistants
Example applications:
- Automatic extraction of criminal legal entities
- Criminal case similarity search
- AI-powered criminal case assistant
- Legal reasoning research
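As an illustration of case similarity retrieval, the sketch below ranks cases by bag-of-words cosine similarity over the `facts` field. It is a deliberately simple baseline using only the standard library, not the method used by the dataset authors; a real system would more likely use Arabic-aware tokenization and dense embeddings. The function names are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    # \w matches Unicode letters and digits, so Arabic words are kept whole
    # as long as the text carries no diacritics.
    return re.findall(r"\w+", text)


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def most_similar(query: str, cases: list[dict], top_k: int = 3) -> list[tuple[str, float]]:
    """Return (case_id, score) pairs for the top_k cases closest to the query."""
    q = Counter(tokenize(query))
    scored = [
        (c["case_id"], cosine(q, Counter(tokenize(c.get("facts") or ""))))
        for c in cases
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]
```

The same ranking step can serve as the retrieval stage of a RAG pipeline, with the top-ranked `facts` and `reasoning` passed to the generator as context.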
---
## Preprocessing
The dataset has been:
- Cleaned and normalized
- Structured into consistent sections
- Deduplicated
- Standardized in UTF-8 encoding
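The card does not say how these steps were carried out. The sketch below shows one common way to normalize Arabic text and drop duplicates, so that users can apply comparable cleaning to their own queries or additional data. Every choice here (NFKC normalization, stripping diacritics and tatweel, hashing the `facts` field) is an assumption, and the function names are hypothetical.

```python
import hashlib
import re
import unicodedata

TATWEEL = "\u0640"  # Arabic elongation character
# Arabic diacritics U+064B..U+0652 plus superscript alef U+0670.
DIACRITICS = re.compile("[\u064B-\u0652\u0670]")


def normalize(text: str) -> str:
    """Apply NFKC, remove diacritics and tatweel, and collapse whitespace."""
    text = unicodedata.normalize("NFKC", text)
    text = DIACRITICS.sub("", text).replace(TATWEEL, "")
    return re.sub(r"\s+", " ", text).strip()


def deduplicate(records: list[dict], key: str = "facts") -> list[dict]:
    """Keep the first record for each distinct normalized value of `key`."""
    seen: set[str] = set()
    unique = []
    for rec in records:
        text = normalize(rec.get(key) or "")
        if not text:
            # Records with no text in `key` cannot be compared; keep them all.
            unique.append(rec)
            continue
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(rec)
    return unique
```

Hashing the normalized text catches exact duplicates only; near-duplicate rulings would need a fuzzy method such as MinHash.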
---
## Dataset Characteristics
- Language: Arabic
- Jurisdiction: Egypt
- Legal Domain: Criminal Law
- Data Type: Structured Legal Text
---
## Ethical & Privacy Considerations
- The dataset is derived from publicly available criminal court rulings.
- No confidential data is intentionally included.
- Users are responsible for ensuring ethical use of legal data.
---
## Research Value
Publicly available datasets covering Arabic criminal law are scarce.
This dataset supports research in:
- Arabic Legal NLP
- Criminal law reasoning systems
- AI-based legal assistants
- End-to-end Egyptian Criminal Legal AI systems
Provider:
Mirna2Nageh



