A Century of Evolution in the Complexity of the United States Legal Code

Source: Figshare. Updated: 2025-10-22. Indexed: 2026-04-08.
Download link:
https://figshare.com/articles/dataset/A_Century_of_Evolution_in_the_Complexity_of_the_United_States_Legal_Code/29540039/4
Description:
We use OCR and generative AI techniques to recover and clean printed historical editions of the U.S. Code. This enables computational analysis of federal law even for periods before web-based digital access. The processing pipeline covers:

📄 Contents of U.S. Code: word counts, unique word counts, entropy, scaling exponents, etc.
🌲 Hierarchical Structure: Subtitle → Part → Chapter → Section → Subsection...
🔗 Cross-Reference Relationships: title-to-title citation relationships

Due to repository size constraints, this GitHub repository includes:

🔍 A sample OCR text page (ocr_processing_gemini) for demonstration
🌐 Web-based U.S. Code text from 1994 for structural parsing (Data Set 2)
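
The content metrics named in the description (word count, unique word count, entropy) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `text_metrics`, the lowercase alphanumeric tokenizer, and the use of Shannon entropy in bits over the word-frequency distribution are all assumptions made for illustration, and scaling exponents are not covered.

```python
import math
import re
from collections import Counter


def text_metrics(text: str) -> dict:
    """Hypothetical helper: basic content metrics for one block of text."""
    # Assumed tokenization: lowercase runs of letters and digits.
    words = re.findall(r"[a-z0-9]+", text.lower())
    counts = Counter(words)
    total = len(words)

    # Shannon entropy (bits) of the empirical word-frequency distribution.
    entropy = 0.0
    if total:
        entropy = -sum(
            (c / total) * math.log2(c / total) for c in counts.values()
        )

    return {
        "word_count": total,
        "unique_word_count": len(counts),
        "entropy_bits": entropy,
    }


if __name__ == "__main__":
    sample = "The term person includes a corporation. The term includes a trust."
    print(text_metrics(sample))
```

Applied to each title and edition separately, a function like this would produce one row of metrics per title per year, which is the kind of table the description suggests.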
Provided by:
Geoffrey B. West; Hyejin Youn; Christopher P. Kempes; Jisung Yoon; Dawoon Jeong; James Holehouse
Created:
2025-10-16