horelulus/ID_REG_KG_2511
Source: Hugging Face. Updated 2026-03-28; indexed 2026-04-12.
Download link:
https://hf-mirror.com/datasets/horelulus/ID_REG_KG_2511
Resource description:
---
license: cc-by-4.0
task_categories:
- text-generation
- question-answering
- feature-extraction
- sentence-similarity
language:
- id
tags:
- legal
- indonesia
- knowledge-graph
- rag
- production-ready
- ontology
- regulation
- embeddings
- tfidf
- hybrid-search
- legal-tech
---
# ID_REG_KG_2511: Indonesian Legal Regulation Knowledge Graph
ID_REG_KG_2511 is a structured dataset that represents Indonesian laws and regulations as a **Knowledge Graph (KG)**. It turns flat legal text into a network of entities and relationships, which supports Retrieval-Augmented Generation (RAG) and multi-step legal reasoning over linked provisions.
## Dataset Description
The dataset focuses on the structural hierarchy and interconnections of Indonesian regulations (Undang-Undang, Peraturan Pemerintah, etc.). By mapping "Articles" (Pasal), "Chapters" (Bab), and "Entities" (legal subjects and objects), it provides a foundation for "GraphRAG" applications in the Indonesian legal-tech space.
### Key Features
- **Structured Ontology**: Contains defined relationships between legal nodes (e.g., `MEMPUNYAI_ISI`, `MENGATUR_TENTANG`, `DIUBAH_OLEH`).
- **Granular Units**: Data is broken down into the smallest logical units of law to improve precision in similarity searches.
- **Production-Ready**: Formatted for immediate ingestion into Graph Databases (like Neo4j) or Vector Databases for hybrid search.
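
As a rough illustration of graph-database ingestion, the sketch below turns one triple into a parameterized Cypher `MERGE` statement. The node label `Node`, the `name` property, and the function `triple_to_cypher` are hypothetical names chosen for this example, not part of the dataset. Because Cypher does not accept a relationship type as a query parameter, the relation is validated against a strict pattern before being placed in the query text.

```python
import re

# Relationship types are interpolated into the query text, so only plain
# upper-case identifiers such as BAGIAN_DARI are accepted.
_REL_TYPE = re.compile(r"[A-Z][A-Z0-9_]*")


def triple_to_cypher(head: str, relation: str, tail: str) -> tuple[str, dict]:
    """Return a Cypher query and its parameters for one (head, relation, tail)."""
    if not _REL_TYPE.fullmatch(relation):
        raise ValueError(f"unsafe relationship type: {relation!r}")
    # `Node` and `name` are assumed for illustration; adapt to your own schema.
    query = (
        "MERGE (h:Node {name: $head}) "
        "MERGE (t:Node {name: $tail}) "
        f"MERGE (h)-[:{relation}]->(t)"
    )
    return query, {"head": head, "tail": tail}


query, params = triple_to_cypher("Pasal 1", "BAGIAN_DARI", "BAB I Ketentuan Umum")
print(query)
print(params)
```

The returned pair can be passed to whichever Neo4j client you use; the head and tail values travel as parameters, so only the validated relation name is ever written into the query string.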
## Data Structure
Each entry in the dataset typically represents a triplet or a structured node containing:
- **Head**: The source entity (e.g., "Pasal 1").
- **Relation**: The semantic link (e.g., "BAGIAN_DARI").
- **Tail**: The target entity or content (e.g., "BAB I Ketentuan Umum").
- **Metadata**: Information regarding the specific regulation (ID, Year, Type).
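
The structure above can be sketched as a small Python record. The field names (`head`, `relation`, `tail`, `metadata`) and the metadata keys are assumptions that mirror the description, not the dataset's confirmed column names; check the actual files before relying on them.

```python
from dataclasses import dataclass, field


@dataclass
class Triple:
    """One graph edge; field names are illustrative, not the dataset's schema."""
    head: str
    relation: str
    tail: str
    metadata: dict = field(default_factory=dict)


def from_record(record: dict) -> Triple:
    """Build a Triple from a plain dict; the key names here are assumed."""
    return Triple(
        head=record["head"],
        relation=record["relation"],
        tail=record["tail"],
        metadata=dict(record.get("metadata", {})),
    )


example = from_record({
    "head": "Pasal 1",
    "relation": "BAGIAN_DARI",
    "tail": "BAB I Ketentuan Umum",
    # Placeholder metadata values, invented for illustration only.
    "metadata": {"id": "example-id", "year": 2000, "type": "Undang-Undang"},
})
print(example.head, example.relation, example.tail)
```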
## Use Cases
1. **Legal Question Answering**: Enhancing LLMs to provide citations by traversing the graph.
2. **Regulatory Compliance**: Mapping how different regulations interact with or contradict one another.
3. **Hybrid Search**: Combining TF-IDF and vector embeddings with graph context to find the most relevant legal provisions.
4. **Feature Extraction**: Identifying legal obligations and sanctions automatically.
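
To make the hybrid-search and citation use cases more concrete, here is a minimal sketch that ranks articles with a hand-rolled TF-IDF score and then attaches graph context by following outgoing edges. The corpus text, the edges, and all function names are invented for illustration and are not taken from the dataset. The embedding component is left out to keep the example dependency-free; in practice its similarity score would be combined with the TF-IDF score.

```python
import math
from collections import Counter

# Toy corpus and edges, invented for illustration; not taken from the dataset.
DOCS = {
    "Pasal 1": "ketentuan umum definisi istilah",
    "Pasal 2": "sanksi administratif bagi pelanggar",
    "Pasal 3": "kewajiban pelaporan pajak tahunan",
}
EDGES = [
    ("Pasal 1", "BAGIAN_DARI", "BAB I"),
    ("Pasal 2", "BAGIAN_DARI", "BAB II"),
    ("Pasal 3", "BAGIAN_DARI", "BAB II"),
]


def tfidf_scores(query: str, docs: dict) -> dict:
    """Score each document by the summed TF-IDF weight of the query terms."""
    tokenized = {name: text.lower().split() for name, text in docs.items()}
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(t for tokens in tokenized.values() for t in set(tokens))
    scores = {}
    for name, tokens in tokenized.items():
        counts = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if counts[term]:
                tf = counts[term] / len(tokens)
                idf = math.log(n_docs / doc_freq[term]) + 1.0
                score += tf * idf
        scores[name] = score
    return scores


def graph_context(node: str, edges: list) -> list:
    """Return (relation, tail) pairs for every edge leaving the node."""
    return [(rel, tail) for head, rel, tail in edges if head == node]


def search(query: str, docs: dict = DOCS, edges: list = EDGES) -> tuple:
    """Pick the top-scoring article and return it with its graph context."""
    scores = tfidf_scores(query, docs)
    best = max(scores, key=scores.get)
    return best, graph_context(best, edges)


print(search("sanksi administratif"))
```

Returning the `BAGIAN_DARI` edge alongside the matching article gives an LLM the enclosing chapter to cite, which is the kind of graph traversal the first use case describes.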
## Language
The dataset is entirely in **Indonesian (id)**, preserving the formal legal terminology used in official government gazettes.
## License
This dataset is licensed under the [Creative Commons Attribution 4.0 International (CC BY 4.0)](https://creativecommons.org/licenses/by/4.0/). You are free to share and adapt the material for any purpose, even commercially, as long as appropriate credit is given.
## Citation
If you use this dataset in your research or production environment, please cite it as:
```text
Azzindani. (2025). ID_REG_KG_2511: Indonesian Legal Regulation Knowledge Graph. Hugging Face Datasets.
```
---
Provider:
horelulus



