
Machine-Readable Vocabulary Files of the "Alter Realkatalog" (ARK) of Berlin State Library (SBB)

Source: NIAID Data Ecosystem, indexed 2026-05-02
Download link:
https://zenodo.org/record/13301019
Description:
This dataset contains two versions of the ARK (Alter Realkatalog) vocabulary files, in .tsv and .ttl format, which are used to train models for automatic subject indexing with the modular Annif tool. The ARK is a historical classification system that was used to describe historical works in the collections of the Staatsbibliothek zu Berlin – Berlin State Library up to 1955. The dataset was created to generate automatic indexing suggestions for historical texts that have not yet been manually classified with the ARK (for a detailed description of the ARK, see also Metadata of the "Alter Realkatalog" (ARK) of Berlin State Library (SBB)). Together with specific corpus training data, these vocabulary files serve as input to Annif, which was used to create the corresponding models published on Hugging Face by the Staatsbibliothek zu Berlin – Preußischer Kulturbesitz community. The associated corpus training data were extracted from the title data of the Metadata of the "Alter Realkatalog" (ARK).
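As a rough illustration of how a .tsv vocabulary file might be inspected before it is handed to Annif, the sketch below reads a tab-separated file into a dictionary. It assumes the simple layout that Annif accepts for TSV vocabularies (a URI in angle brackets, a tab, then a label); the actual column layout of the ARK files should be checked against the Zenodo record. The function name `read_vocabulary_tsv` and the example.org URIs are hypothetical and are not taken from the dataset.

```python
import csv
import io


def read_vocabulary_tsv(handle):
    """Return a {uri: label} dict from a tab-separated vocabulary.

    Assumed layout (not verified against the ARK files):
    <uri> TAB label, one concept per line.
    """
    vocabulary = {}
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) < 2:
            continue  # skip blank or malformed lines
        uri = row[0].strip().strip("<>")  # drop the surrounding angle brackets
        vocabulary[uri] = row[1].strip()
    return vocabulary


# Hypothetical sample rows standing in for a real vocabulary file.
sample = io.StringIO(
    "<http://example.org/ark/A>\tExample class A\n"
    "<http://example.org/ark/B>\tExample class B\n"
)
vocab = read_vocabulary_tsv(sample)
```

For a downloaded file, the same function could be called with `open(path, encoding="utf-8", newline="")` in place of the in-memory sample.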
Created:
2024-08-14