
horelulus/ID_REG_KG_2511

Source: Hugging Face. Updated 2026-03-28. Indexed 2026-04-12.
Download link:
https://hf-mirror.com/datasets/horelulus/ID_REG_KG_2511
Resource description:
---
license: cc-by-4.0
task_categories:
- text-generation
- question-answering
- feature-extraction
- sentence-similarity
language:
- id
tags:
- legal
- indonesia
- knowledge-graph
- rag
- production-ready
- ontology
- regulation
- embeddings
- tfidf
- hybrid-search
- legal-tech
---

# ID_REG_KG_2511: Indonesian Legal Regulation Knowledge Graph

ID_REG_KG_2511 is a high-quality, structured dataset specifically designed to represent Indonesian laws and regulations in a **Knowledge Graph (KG)** format. This dataset transforms flat legal text into a network of entities and relationships, enabling advanced Retrieval-Augmented Generation (RAG) and complex legal reasoning.

## Dataset Description

The dataset focuses on the structural hierarchy and inter-connectivity of Indonesian regulations (Undang-Undang, Peraturan Pemerintah, etc.). By mapping "Articles" (Pasal), "Chapters" (Bab), and "Entities" (Legal subjects/objects), this repository provides a foundation for "GraphRAG" applications in the Indonesian legal-tech space.

### Key Features

- **Structured Ontology**: Contains defined relationships between legal nodes (e.g., `MEMPUNYAI_ISI`, `MENGATUR_TENTANG`, `DIUBAH_OLEH`).
- **Granular Units**: Data is broken down into the smallest logical units of law to ensure high precision in similarity searches.
- **Production-Ready**: Formatted for immediate ingestion into Graph Databases (like Neo4j) or Vector Databases for hybrid search.

## Data Structure

Each entry in the dataset typically represents a triplet or a structured node containing:

- **Head**: The source entity (e.g., "Pasal 1").
- **Relation**: The semantic link (e.g., "BAGIAN_DARI").
- **Tail**: The target entity or content (e.g., "BAB I Ketentuan Umum").
- **Metadata**: Information regarding the specific regulation (ID, Year, Type).

## Use Cases

1. **Legal Question Answering**: Enhancing LLMs to provide citations by traversing the graph.
2. **Regulatory Compliance**: Mapping how different regulations interact or contradict one another.
3. **Hybrid Search**: Combining TF-IDF and Vector Embeddings with Graph context to find the most relevant legal precedents.
4. **Feature Extraction**: Identifying legal obligations and sanctions automatically.

## Language

The dataset is entirely in **Indonesian (id)**, preserving the formal legal terminology used in official government gazettes.

## License

This dataset is licensed under the [Creative Commons Attribution 4.0 International (CC BY 4.0)](https://creativecommons.org/licenses/by/4.0/). You are free to share and adapt the material for any purpose, even commercially, as long as appropriate credit is given.

## Citation

If you use this dataset in your research or production environment, please cite it as:

```text
Azzindani. (2025). ID_REG_KG_2511: Indonesian Legal Regulation Knowledge Graph. Hugging Face Datasets.
```

---
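
## Example: Working with Triplets

The Head / Relation / Tail / Metadata layout described under Data Structure, and the graph traversal mentioned under Use Cases, can be sketched roughly as follows. This is a minimal illustration only: the field names (`head`, `relation`, `tail`, `metadata`), the sample rows, the regulation name "Peraturan Contoh", the metadata values, and the helper functions `build_index` and `containment_path` are all assumptions made for this sketch and are not taken from the dataset's actual schema. Check the real column names before adapting it.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Triple:
    head: str                  # source entity, e.g. "Pasal 1"
    relation: str              # semantic link, e.g. "BAGIAN_DARI"
    tail: str                  # target entity or content
    metadata: dict = field(default_factory=dict)  # regulation ID, year, type


# Hypothetical rows; the regulation and its metadata are invented placeholders.
META = {"id": "EXAMPLE-1", "year": 2000, "type": "Undang-Undang"}
triples = [
    Triple("Pasal 1", "BAGIAN_DARI", "BAB I Ketentuan Umum", META),
    Triple("Pasal 1", "MEMPUNYAI_ISI", "(article text placeholder)", META),
    Triple("BAB I Ketentuan Umum", "BAGIAN_DARI", "Peraturan Contoh", META),
]


def build_index(rows):
    """Map (head, relation) to the list of tails for fast lookups."""
    index = defaultdict(list)
    for row in rows:
        index[(row.head, row.relation)].append(row.tail)
    return index


def containment_path(index, entity, relation="BAGIAN_DARI"):
    """Follow `relation` edges upward from `entity`, e.g. article -> chapter -> regulation."""
    path = [entity]
    seen = {entity}
    while True:
        parents = index.get((path[-1], relation), [])
        # Stop at the root, or if a cycle would revisit a node.
        if not parents or parents[0] in seen:
            break
        path.append(parents[0])
        seen.add(parents[0])
    return path


index = build_index(triples)
print(containment_path(index, "Pasal 1"))
```

A path like this is one way to attach a citation (article, chapter, regulation) to a retrieved passage. The same `(head, relation)` index could also back lookups over other relations such as `DIUBAH_OLEH`, assuming those edges are present in the data.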
Provider:
horelulus