
EB-KG: Knowledge Graph of the First 8 Editions of the Encyclopaedia Britannica (1768-1860)

Source: Mendeley Data. Updated: 2024-03-27. Indexed: 2024-06-29.
Download link:
https://zenodo.org/record/6673990
Description:
This Knowledge Graph represents the information of the first eight editions of the Encyclopaedia Britannica (years: 1768 to 1860) in RDF (ttl format). The raw dataset is provided by the NLS at this link and comprises eight editions and 195 volumes, with a total size of 44GB. It uses two XML schemas: METS for descriptive, structural, technical and administrative metadata (Title, Author, Publisher, etc.), and ALTO for encoding the OCR text of a page. In this work, we extracted the information from the METS and ALTO XML files using the defoe tool and created a new Knowledge Graph called EB-KG.

The EB-KG uses the EB Ontology to represent the extracted information. It contains 1,638,239 RDF triples and covers 8 editions. Each edition can have several Volumes and references to Books and Supplements; it also has an Editor and a Publisher, each of which can be a Person or an Organization. A Volume has several Pages, each of which can contain several Terms. A Term can be either a Topic (a term described across several pages, often combining text, pictures, and tables) or an Article (a description of the term in one or two paragraphs of text, similar to an entry in a dictionary). The data model of the EB-KG can be found here.

Because the original ALTO files do not indicate the start and end of each EB term, the first part of our work involved the automated extraction of all terms (along with their metadata) across editions, so that they can be analysed independently of the surrounding text.
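
The hierarchy described above (Edition, Volume, Page, Term, with a Term being either a Topic or an Article) can be illustrated with a minimal Python sketch. All class names, field names, and sample values below are assumptions made for illustration; they are not the actual identifiers of the EB Ontology or the output of the defoe tool.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional, Tuple


@dataclass
class Term:
    """A headword and its OCR text (hypothetical fields)."""
    name: str
    text: str


@dataclass
class Article(Term):
    """Short entry: one or two paragraphs, like a dictionary entry."""


@dataclass
class Topic(Term):
    """Long entry described across several pages."""


@dataclass
class Page:
    number: int
    terms: List[Term] = field(default_factory=list)


@dataclass
class Volume:
    number: int
    pages: List[Page] = field(default_factory=list)


@dataclass
class Edition:
    number: int
    editor: Optional[str] = None      # a Person or an Organization
    publisher: Optional[str] = None   # a Person or an Organization
    volumes: List[Volume] = field(default_factory=list)


def iter_terms(edition: Edition) -> Iterator[Tuple[int, int, Term]]:
    """Walk Edition -> Volume -> Page -> Term, yielding each term with its location."""
    for volume in edition.volumes:
        for page in volume.pages:
            for term in page.terms:
                yield volume.number, page.number, term


# Toy data only: placeholder names and text, not real encyclopaedia content.
edition = Edition(
    number=1,
    volumes=[
        Volume(
            number=1,
            pages=[
                Page(
                    number=10,
                    terms=[
                        Article("EXAMPLE ARTICLE", "placeholder text"),
                        Topic("EXAMPLE TOPIC", "placeholder text"),
                    ],
                )
            ],
        )
    ],
)

rows = [(v, p, type(t).__name__, t.name) for v, p, t in iter_terms(edition)]
for row in rows:
    print(row)
```

Modelling Topic and Article as subclasses of Term mirrors the "either/or" wording of the description; the actual RDF representation may express this distinction differently.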
Created:
2023-06-28