Documents first indexed in the SLUB catalogue in 2022

NIAID Data Ecosystem, indexed 2026-05-01

Download link:

https://zenodo.org/record/7953810


Description:

The data set is available as a compressed, line-delimited JSON file. It contains the 7,127,497 documents first indexed in the SLUB catalogue in 2022, each with the fields id (document identifier) and first_indexed (timestamp).

The id consists of an optional creator id, a source id, and a record id, following the scheme [{creator_id}-]{source_id}-{record_id}. If the record id contains characters that are unsuitable for URLs, it is base64-encoded without padding.

The first_indexed field, which is part of VuFind's Solr index schema, was determined by a Django application that regularly monitors the Solr cores of the SLUB catalogue. The document sets from different points in time are compared with each other to identify new documents, i.e. new document identifiers. When a new document is found, first_indexed is assigned the timestamp from its last_indexed field. The resulting field serves as the basis for creating a list of new titles.

Note, however, that not every first-indexed document is necessarily a new title, and the documents in this data set are not necessarily still in the catalogue. To check whether a document is currently in the SLUB catalogue, retrieve its detailed view using the following URL scheme: https://katalog.slub-dresden.de/id/{id}. Example: https://katalog.slub-dresden.de/id/0-173837243X.
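The description above can be turned into a small reading helper. The following Python sketch is illustrative only and rests on several assumptions that the description does not confirm: that the compression is gzip, that base64-encoded record ids decode to UTF-8 text, and that either the standard or the URL-safe base64 alphabet is used. All function names are hypothetical. Because the creator id is optional and a record id might itself contain a hyphen, the sketch does not attempt to split an id into its parts.

```python
import base64
import gzip
import json
from typing import Iterable, Iterator

# URL scheme quoted in the data set description.
CATALOGUE_URL = "https://katalog.slub-dresden.de/id/{id}"


def catalogue_url(doc_id: str) -> str:
    """Build the detailed-view URL for a document identifier."""
    return CATALOGUE_URL.format(id=doc_id)


def decode_unpadded_base64(value: str) -> str:
    """Decode a base64 string whose trailing '=' padding was stripped.

    Assumption: the decoded bytes are UTF-8 text. urlsafe_b64decode
    accepts '-' and '_' as well as the standard '+' and '/'.
    """
    padded = value + "=" * (-len(value) % 4)  # restore the missing padding
    return base64.urlsafe_b64decode(padded).decode("utf-8")


def iter_records(lines: Iterable[str]) -> Iterator[dict]:
    """Parse line-delimited JSON, skipping blank lines."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def read_dataset(path: str) -> Iterator[dict]:
    """Stream records from the compressed file (assumption: gzip)."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        yield from iter_records(handle)
```

Streaming the file one line at a time keeps memory use flat, which matters for roughly seven million records. A typical loop would call `catalogue_url(record["id"])` for each record yielded by `read_dataset`, with the path pointing at the file downloaded from the Zenodo record.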

Created:

2024-02-09
