aktaint/nepal-legal-corpus
Source: Hugging Face · Updated 2026-04-07 · Indexed 2026-04-12
Download link:
https://hf-mirror.com/datasets/aktaint/nepal-legal-corpus
Resource description:
---
license: mit
language:
- ne
- en
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
dataset_info:
  features:
  - name: act_name
    dtype: large_string
  - name: filename
    dtype: large_string
  - name: text_ne
    dtype: large_string
  - name: source
    dtype: large_string
  - name: extraction_method
    dtype: large_string
  splits:
  - name: train
    num_bytes: 1261984
    num_examples: 21
  download_size: 378433
  dataset_size: 1261984
---
---
language: ne
tags:
- legal
- nepali
- law
- rag
- government
license: cc0-1.0
---
# Nepal Legal Corpus (v0.1 Seed)
**Initial public release.** Official Nepali law books from the Law Books Management Board (lbmb.gov.np).
### Included Volumes (v0.1)
- नेपाल ऐन सङ्ग्रह खण्ड १७, २०८२
- नेपाल ऐन सङ्ग्रह खण्ड ५, २०८२
- नेपाल नियम सङ्ग्रह खण्ड ११, २०८२
- नेपाल ऐन सङ्ग्रह खण्ड ९, २०८२
- नेपाल ऐन सङ्ग्रह खण्ड ६(क), २०८१
### Structure
- `raw_pdfs/` — Original PDFs (via Git LFS if large)
- `extracted_text/ne/` — Raw extracted Nepali text
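
The card metadata declares a single `train` split with five string columns (`act_name`, `filename`, `text_ne`, `source`, `extraction_method`). A minimal sketch of loading that split and sanity-checking each row might look like the following. It assumes the Hugging Face `datasets` library is installed and the repository is reachable; the helper names `missing_columns` and `load_corpus` are illustrative and not part of the dataset.

```python
from typing import Any, Mapping

REPO_ID = "aktaint/nepal-legal-corpus"

# Column names as declared under `dataset_info.features` in the card metadata.
EXPECTED_COLUMNS = ("act_name", "filename", "text_ne", "source", "extraction_method")


def missing_columns(record: Mapping[str, Any]) -> list[str]:
    """Return the declared columns that are absent or not strings in `record`."""
    return [c for c in EXPECTED_COLUMNS if not isinstance(record.get(c), str)]


def load_corpus():
    """Download the `train` split (needs `pip install datasets` and network access)."""
    # Imported lazily so the helper above can be used offline.
    from datasets import load_dataset

    return load_dataset(REPO_ID, split="train")


# Usage (requires network access):
#   corpus = load_corpus()
#   for row in corpus:
#       print(row["act_name"], missing_columns(row) or "OK")
```

Checking rows against the declared columns is a cheap way to notice if a later release changes the schema.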
**Goal**: Build a high-quality open corpus for Nepali legal AI, RAG, fine-tuning, and education.
Complementary to government/UNDP digitization efforts. Contributions welcome!
**Next steps**: better extraction with vision models, a translation layer, and a structured Parquet format.
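
For the RAG use case named in the goal, the long `text_ne` field usually needs to be split into retrieval-sized passages. The sketch below is one simple way to do that, not a method prescribed by the dataset: it breaks text at the danda (`।`) and at line breaks, then greedily packs sentences into chunks. The function name `chunk_nepali_text` and the `max_chars` default are assumptions chosen for illustration.

```python
def chunk_nepali_text(text: str, max_chars: int = 1000) -> list[str]:
    """Greedily pack sentences into chunks of at most `max_chars` characters.

    Sentences are delimited by the danda (।) or by line breaks. A single
    sentence longer than `max_chars` is kept whole rather than cut.
    """
    # Put a line break after every danda, then split on line breaks,
    # so each danda stays attached to the sentence it ends.
    pieces = text.replace("।", "।\n").split("\n")
    sentences = [p.strip() for p in pieces if p.strip()]

    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would overflow: close the current chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

Splitting at sentence boundaries keeps each legal provision's sentences intact, which tends to matter more for retrieval quality than hitting an exact chunk size. Section-aware splitting would likely be better once the extraction is structured.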
Provider:
aktaint



