LadiesDebating-KG: A Knowledge Graph for representing the "Edinburgh Ladies' Debating Society Digital Collection" (1865 - 1880)
Source: Mendeley Data | Updated: 2024-05-10 | Indexed: 2024-06-29
Download link:
https://zenodo.org/records/6686596
Description:
This Knowledge Graph represents the "Edinburgh Ladies' Debating Society" collection (1865 - 1880) in RDF (Turtle, .ttl format). The collection consists of the complete runs of two Edinburgh journals, 'The Attempt' (10 volumes, 1865-74) and its successor 'The Ladies' Edinburgh Magazine' (6 volumes, 1875-80). These publications were produced by a leading Edinburgh women's club, known during the period as the Edinburgh Essay Society or the Ladies' Edinburgh Essay Society, and subsequently as the Ladies' Edinburgh Debating Society. The Society existed from 1865 to 1935.

The raw dataset is provided by the National Library of Scotland (NLS). Like other NLS data collections, it is originally provided using two XML schemas: METS for descriptive, structural, technical and administrative metadata (title, author, publisher, etc.), and ALTO for encoding the OCR text of each page. In this work, we extracted the information from the METS and ALTO XML files using the defoe tool, for which we developed a new information extraction query, and created a new Knowledge Graph called LadiesDebating-KG. The LadiesDebating-KG uses the NLS Ontology to represent the extracted information. During the information extraction phase, we also applied several techniques to mitigate two common OCR errors: the long-S and line-break hyphenation.

The LadiesDebating-KG contains 38,279 RDF triples. It holds information from 2 series and 16 volumes: the 'The Attempt' series has 10 volumes and the 'The Ladies' Edinburgh Magazine' series has 6 volumes. Each series has an Editor, mmsid, Shelf-Locator, publication year, etc. A Volume has several Pages, each with its text. A data model for the LadiesDebating-KG is also available.
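The description does not say how the two OCR errors are mitigated, so the following is only a minimal sketch of one common approach, not the defoe implementation. It assumes that line-break hyphenation can be undone with a regular expression and that a long-S misread as "f" can be corrected by checking candidate spellings against a word list. The function names and the tiny `VOCAB` set are hypothetical and exist only for illustration.

```python
import re
from itertools import product

# Hypothetical mini-vocabulary; a real run would use a full English word list.
VOCAB = {"society", "essay", "said", "first", "of", "the", "for",
         "magazine", "debating"}


def join_hyphenated_line_breaks(text: str) -> str:
    """Rejoin words split across lines, e.g. 'maga-\\nzine' -> 'magazine'.

    Caveat: a genuine compound broken at the hyphen ('well-\\nknown')
    is joined too, so this is a heuristic rather than an exact fix.
    """
    return re.sub(r"(\w)-[ \t]*\n[ \t]*(\w)", r"\1\2", text)


def fix_long_s(word: str, vocab=VOCAB) -> str:
    """Correct long-S errors in a single word.

    The glyph 'ſ' maps directly to 's'. When OCR has rendered it as 'f',
    try replacing subsets of the lowercase 'f' characters with 's' and
    keep the first candidate found in the vocabulary.
    """
    word = word.replace("ſ", "s")
    if word.lower() in vocab or "f" not in word:
        return word
    positions = [i for i, ch in enumerate(word) if ch == "f"]
    for mask in product([False, True], repeat=len(positions)):
        if not any(mask):
            continue  # skip the unchanged spelling
        chars = list(word)
        for pos, swap in zip(positions, mask):
            if swap:
                chars[pos] = "s"
        candidate = "".join(chars)
        if candidate.lower() in vocab:
            return candidate
    return word  # no plausible correction found; leave the word as-is


def clean_text(text: str) -> str:
    """Apply both fixes: hyphenation first, then word-level long-S repair."""
    text = join_hyphenated_line_breaks(text)
    return re.sub(r"[A-Za-zſ]+", lambda m: fix_long_s(m.group(0)), text)


if __name__ == "__main__":
    page = "The Ladies' Edinburgh Maga-\nzine of the Effay Society was firft printed."
    print(clean_text(page))
```

Words already in the vocabulary (such as "of" or "for") are left untouched, which is why the word list matters: with too small a list, legitimate words containing "f" could be rewritten incorrectly.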
Created:
2023-06-28



