
horelulus/ID_REG_QA_Small

Hugging Face · Updated 2026-03-28 · Indexed 2026-04-12
Download link:
https://hf-mirror.com/datasets/horelulus/ID_REG_QA_Small
Resource description:
---
license: apache-2.0
---

# 🧾 Indonesian Legal QA Dataset

This repository contains a **question-answer (QA) dataset** generated from parsed Indonesian regulations, focusing on **legal quoting and comprehension**. Designed to facilitate legal-aware LLMs, the dataset provides direct QA mappings to individual articles for contextual understanding and reference.

---

## 📌 Dataset Highlights

* **Source**: Generated from the [ID\_REG\_Parsed](https://huggingface.co/datasets/Azzindani/ID_REG_Parsed) repository
* **Format**: QA pairs based on individual articles (no chunking)
* **Scale**: Augmented by applying 10 QA templates across suitable regulation entries
* **Filtering**: Programmatic filtering removes redundant or overly broad article explanations
* **Target Use**: Train/test LLMs for **regulation comprehension**, **legal quoting**, and **document-level QA**

---

## ⚙️ Pipeline Overview

* **Environment**: Executed in a single Jupyter Notebook on **Kaggle Cloud**
* **Data Flow**:
  1. **Pull** parsed articles from `ID_REG_Parsed`
  2. Filter and refine results for clarity and legal context
  3. Apply **template-driven QA generation** (10 variations)
  4. **Push** QA dataset directly to this repository
* **Performance**:
  * Completed in \~20 minutes using Kaggle GPU resources
  * Cloud-to-cloud transfer without local storage dependency

---

## 🧠 Use Cases

* Fine-tuning LLMs for **legal question answering**
* Benchmarks for **article referencing and quoting**
* Few-shot prompting for legal search assistants
* Legal text evaluation with grounded answers

---

## ⚠️ Disclaimer

This dataset is intended for **research and development** only. QA pairs are generated synthetically from publicly available legal text and may not reflect official interpretations.

---

## 🙏 Acknowledgments

* **[Hugging Face](https://huggingface.co/)** for hosting open datasets
* **[Kaggle](https://www.kaggle.com/)** for compute and cloud-to-cloud capabilities

---
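
The filter and template-driven generation steps described in the pipeline overview could look roughly like the sketch below. It is an illustration only: the README does not show the actual schema, templates, or filtering rules, so the field names (`regulation`, `article`, `content`), the two example templates, and the `MAX_CHARS` cutoff are all assumptions, and the sample records are placeholders rather than real legal text.

```python
from typing import Dict, List, Set

# Hypothetical templates; the dataset applies 10, which the README does not list.
TEMPLATES: List[str] = [
    "What does Article {article} of {regulation} state?",
    "Quote Article {article} of {regulation}.",
]

# Assumed length cutoff standing in for "overly broad" articles.
MAX_CHARS = 2000


def is_suitable(record: Dict[str, str], seen: Set[str]) -> bool:
    """Keep non-empty, reasonably short articles whose text was not seen before."""
    text = record["content"].strip()
    if not text or len(text) > MAX_CHARS:
        return False
    if text in seen:  # drop redundant (duplicate) article text
        return False
    seen.add(text)
    return True


def generate_qa(records: List[Dict[str, str]],
                templates: List[str] = TEMPLATES) -> List[Dict[str, str]]:
    """Apply every template to every suitable article (one article per answer, no chunking)."""
    seen: Set[str] = set()
    pairs: List[Dict[str, str]] = []
    for record in records:
        if not is_suitable(record, seen):
            continue
        for template in templates:
            question = template.format(
                article=record["article"], regulation=record["regulation"]
            )
            pairs.append({"question": question, "answer": record["content"].strip()})
    return pairs


if __name__ == "__main__":
    # Placeholder records, not taken from the real dataset.
    sample = [
        {"regulation": "Regulation A", "article": "1", "content": "Placeholder text of article 1."},
        {"regulation": "Regulation A", "article": "2", "content": "Placeholder text of article 2."},
    ]
    for pair in generate_qa(sample):
        print(pair["question"])
```

Because each answer is the full text of a single article, the number of QA pairs grows as (suitable articles) × (templates), which matches the README's description of scale coming from template augmentation rather than from chunking.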
Provider:
horelulus