
UK-LEX Dataset - Part of Chalkidis and Søgaard (2022)

Source: NIAID Data Ecosystem (indexed 2026-03-13)
Download link:
https://zenodo.org/record/6355464
Description:
The UK-LEX dataset is part of the work "Ilias Chalkidis and Anders Søgaard. Improved Multi-label Classification under Temporal Concept Drift: Rethinking Group-Robust Algorithms in a Label-Wise Setting. 2022. In the Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics. Dublin, Ireland."

Details: United Kingdom (UK) legislation is publicly available as part of the United Kingdom's National Archives (https://www.legislation.gov.uk). Most of the laws have been categorized into thematic categories (e.g., health-care, finance, education, transportation, planning) that are presented in the document preamble and are used for archival indexing purposes. We release a new dataset, which comprises 36.5k UK laws (documents). The dataset is split chronologically into training (20k, 1975–2002), development (8.5k, 2002–2008), and test (8.5k, 2008–2018) subsets. We manually extract and cluster the topics to support two different label granularities, comprising 18 and 69 topics (labels), respectively.

Data Files:
uk-lex18.jsonl: The dataset where documents are annotated with 18 different topics (labels).
uk-lex69.jsonl: The dataset where documents are annotated with 69 different topics (labels).
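Because both data files are in JSON Lines format, they can be read one record per line. The following is a minimal sketch, not the authors' loading code. The description does not document the record schema, so the field names `labels` and `split` used in the usage note are assumptions for illustration, and the helper names `load_jsonl`, `group_by`, and `label_counts` are hypothetical. Inspect the keys of a real record before relying on any of them.

```python
import json
from collections import Counter, defaultdict
from pathlib import Path


def load_jsonl(path):
    """Read a JSON Lines file: one JSON object per non-empty line."""
    records = []
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # skip blank lines
                records.append(json.loads(line))
    return records


def group_by(records, key):
    """Group records by the value stored under `key` (e.g. a split name).

    Records that lack `key` are collected under None.
    """
    groups = defaultdict(list)
    for record in records:
        groups[record.get(key)].append(record)
    return dict(groups)


def label_counts(records, key):
    """Count label occurrences; `key` is assumed to hold a list of labels."""
    counts = Counter()
    for record in records:
        counts.update(record.get(key, []))
    return counts
```

A typical session might start with `records = load_jsonl("uk-lex18.jsonl")` followed by `print(records[0].keys())` to discover the actual field names. If the records turn out to carry a split indicator and a label list, `group_by(records, "split")` and `label_counts(records, "labels")` would then give the subset sizes and the label distribution, with the key names replaced by whatever the file really uses.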
Created:
2022-03-16