
judicialmind/india-acts

Source: Hugging Face. Updated 2026-04-23; indexed 2026-04-26.
Download link:
https://hf-mirror.com/datasets/judicialmind/india-acts
Description:

A comprehensive corpus of Indian legislation in PDF form, covering both Central (Parliament) Acts and State / Union Territory Acts in English and Hindi. The files were scraped and consolidated from publicly available government sources, primarily the India Code portal and individual State legislature websites. The dataset is intended as a research and AI-training resource for tasks such as legal document retrieval, statutory question answering, summarization, OCR/parsing benchmarks, multilingual legal NLP, and citation analysis. It contains 12,102 PDFs totaling about 21.7 GB, covers all 28 States and 8 Union Territories of India, and spans the years 1836 to 2025. The dataset is maintained by JudicialMind. Its README gives detailed statistics on file counts, sizes, and year ranges for both Central and State Acts, along with loading instructions, data provenance and methodology, known limitations, license information, and citation details.
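
Since the corpus is distributed as raw PDFs, one plausible way to fetch a small sample is through the `huggingface_hub` client rather than downloading all ~21.7 GB. The sketch below is not taken from the dataset's README: the repository's folder layout is unknown here, so the helper names `select_pdfs` and `download_sample`, and the idea of filtering by a path substring, are illustrative assumptions.

```python
from typing import Iterable, List, Optional

REPO_ID = "judicialmind/india-acts"  # dataset name as given in this listing


def select_pdfs(paths: Iterable[str], must_contain: Optional[str] = None) -> List[str]:
    """Keep only PDF paths, optionally requiring a case-insensitive substring.

    Hypothetical helper: the real folder layout of the repository is not
    documented in this listing, so the substring filter is only a guess at
    how one might narrow down to a State or a language.
    """
    needle = must_contain.lower() if must_contain else None
    selected = []
    for path in paths:
        lowered = path.lower()
        if not lowered.endswith(".pdf"):
            continue  # skip README, metadata files, and so on
        if needle is not None and needle not in lowered:
            continue
        selected.append(path)
    return sorted(selected)


def download_sample(must_contain: Optional[str] = None, limit: int = 5) -> List[str]:
    """Download up to `limit` matching PDFs and return their local cache paths."""
    # Imported lazily so select_pdfs stays usable without the library installed.
    from huggingface_hub import hf_hub_download, list_repo_files

    files = list_repo_files(REPO_ID, repo_type="dataset")
    chosen = select_pdfs(files, must_contain)[:limit]
    return [
        hf_hub_download(repo_id=REPO_ID, filename=name, repo_type="dataset")
        for name in chosen
    ]
```

Calling `download_sample(limit=3)` would require network access and the `huggingface_hub` package. Whether a substring such as a State name actually appears in the file paths has to be checked against the repository's real file listing.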
Provider:
judicialmind