Aukandseal/IRSpubs_2020-2024
Source: Hugging Face. Updated 2024-06-30. Indexed 2024-07-06.
Download link:
https://hf-mirror.com/datasets/Aukandseal/IRSpubs_2020-2024
Description:
The dataset consists of freely available IRS tax publications (excluding forms and instructions) from 2020 through June 2024, including but not limited to Announcements, Notices, Regulation Proposals, Procedures, Orders, and Private Letters. The text is minimally post-processed: only html-like tags and metadata have been removed. The documents are long, contain complex intra-document references, and cover highly specialized subject matter such as tax, law, regulations, and rulings, which makes the dataset suitable for modelling long-document content and for MLM-type tasks. The data is licensed under the original IRS public domain licence.
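The description mentions removal of html-like tags as the only text post-processing. As a rough illustration of what such a step might look like, here is a minimal sketch using only the Python standard library. It is not the dataset author's actual pipeline: the function name `strip_html_like_tags`, the regular expression, and the sample string are all assumptions made for this example.

```python
import re

# Assumption: an "html-like tag" is anything shaped like <name ...> or </name>.
# Requiring a letter after "<" leaves plain comparisons such as "a < b" alone.
TAG_PATTERN = re.compile(r"</?[A-Za-z][^<>]*>")


def strip_html_like_tags(text: str) -> str:
    """Remove html-like tags, then collapse the whitespace they leave behind."""
    without_tags = TAG_PATTERN.sub(" ", text)
    return re.sub(r"\s+", " ", without_tags).strip()


# Hypothetical sample text, not taken from the dataset.
sample = "<p>See <b>section 2</b> of this notice.</p>"
print(strip_html_like_tags(sample))
```

A regular expression is enough for a sketch like this, but real markup with comments, scripts, or attribute values containing angle brackets would likely call for a proper HTML parser.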
Provider:
Aukandseal



