five

Cambridge Law Corpus, 1550-2023

收藏
DataCite Commons2024-02-19 更新2025-04-16 收录
下载链接:
http://reshare.ukdataservice.ac.uk/id/eprint/856927
下载链接
链接失效反馈
官方服务:
资源简介:
The Cambridge Law Corpus (CLC) is a corpus designed for legal AI research. It consists of over 250,000 court cases from the UK. Most cases are from the 21st century, but the corpus includes cases as old as the 16th century. Together with the corpus, annotations on case outcomes for 638 cases, done by legal experts, are provided. The Word files were cleaned and transformed into an XML format. PDF files were converted to textual form via optical character recognition (OCR). The resulting text files were then converted to the XML standard format. Because of legal and ethical considerations, the full Cambridge Law Corpus (CLC) is only available for research purposes under restrictions and available via Related Resources. A smaller dataset consisting of 15 selected cases from the CLC is available on the University of Cambridge Apollo Data Repository which can be accessed via Related Resources.
提供机构:
UK Data Service
创建时间:
2024-02-19
5,000+
优质数据集
54 个
任务类型
进入经典数据集
二维码
社区交流群

面向社区/商业的数据集话题

二维码
科研交流群

面向高校/科研机构的开源数据集话题

数据驱动未来

携手共赢发展

商业合作