MMFS directories for LargeRDFBench datasets
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/record/13954799
Description:
The file mmfs.tar.zst contains all cleaned (RDF 1.1) datasets from LargeRDFBench, encoded in the Mapped-Memory Friendly Store (MMFS) format. The experiments that use this format, including their code, scripts and results, are also available as a Zenodo dataset: 10.5281/zenodo.13960678.
Uncompressed, the 13 directories occupy 36.317 GiB of disk space. The directories inside mmfs.tar.zst are:
Affymetrix
ChEBI
DBPedia-Subset
DrugBank
GeoNames
Jamendo
KEGG
LinkedTCGA-A
LinkedTCGA-E
LinkedTCGA-M
LMDB
NYT
SWDFood
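After extracting the archive (for example with a tar build that supports zstd, such as `tar --zstd -xf mmfs.tar.zst`), it can be useful to confirm that all 13 directories are present. The following is a minimal sketch, not part of the dataset's tooling. The function name `missing_directories` is hypothetical, and the sketch assumes the directories sit directly under the extraction root.

```python
from pathlib import Path

# Directory names exactly as listed in the dataset description.
EXPECTED_DIRS = [
    "Affymetrix", "ChEBI", "DBPedia-Subset", "DrugBank", "GeoNames",
    "Jamendo", "KEGG", "LinkedTCGA-A", "LinkedTCGA-E", "LinkedTCGA-M",
    "LMDB", "NYT", "SWDFood",
]


def missing_directories(root):
    """Return the expected dataset directories that are absent under root.

    Assumption: the archive was extracted so that the dataset
    directories are direct children of `root`.
    """
    root = Path(root)
    return [name for name in EXPECTED_DIRS if not (root / name).is_dir()]
```

An empty list from `missing_directories("path/to/extracted")` means every listed directory was found.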
Each MMFS directory contains 6 files:
shared: the suffixes and prefixes that are shared between multiple RDF terms;
strings: the dictionary that maps the N-Triples representation of each RDF term in the graph to a unique 64-bit sequential identifier. The dictionary also decomposes each N-Triples string into a local part, stored at the end of this file, and a shared prefix/suffix, stored in the shared file;
lexical: a small list L of bitsets, where L[hash(t) % L.length] is the set of literal suffixes for which there may be a term u such that str(u) = str(t);
spo: a triples index that stores, for each subject, a list of predicate-object pairs sorted by ID;
pso: similar to spo, but stores subject-object pairs for each predicate;
ops: similar to spo, but stores predicate-subject pairs for each object.
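To make the roles of these files more concrete, the following is an in-memory toy model of the three triple indexes and of the lexical filter. It is a sketch under stated assumptions and does not read or reproduce the on-disk MMFS layout, which is not specified here. All names (`build_indexes`, `contains`, `objects_of`, `LexicalFilter`) are hypothetical, terms are assumed to be already mapped to integer IDs, and Python's built-in `hash` stands in for whatever hash function MMFS actually uses.

```python
from bisect import bisect_left
from collections import defaultdict


def build_indexes(triples):
    """Build toy spo/pso/ops indexes from (s, p, o) triples of integer IDs."""
    spo, pso, ops = defaultdict(list), defaultdict(list), defaultdict(list)
    for s, p, o in set(triples):  # set() drops duplicate triples
        spo[s].append((p, o))  # subject   -> (predicate, object) pairs
        pso[p].append((s, o))  # predicate -> (subject, object) pairs
        ops[o].append((p, s))  # object    -> (predicate, subject) pairs
    for index in (spo, pso, ops):
        for pairs in index.values():
            pairs.sort()  # sorted by ID, so lookups can use binary search
    return dict(spo), dict(pso), dict(ops)


def contains(index, key, pair):
    """Binary-search the sorted pair list stored under `key`."""
    pairs = index.get(key, [])
    i = bisect_left(pairs, pair)
    return i < len(pairs) and pairs[i] == pair


def objects_of(spo, s, p):
    """Return every object o such that (s, p, o) is in the spo index."""
    pairs = spo.get(s, [])
    i = bisect_left(pairs, (p,))  # (p,) sorts before every (p, o)
    result = []
    while i < len(pairs) and pairs[i][0] == p:
        result.append(pairs[i][1])
        i += 1
    return result


class LexicalFilter:
    """Toy version of the list L of bitsets: false positives are possible,
    false negatives are not."""

    def __init__(self, buckets):
        self.buckets = [0] * buckets  # each int is used as a bitset

    def _slot(self, lexical):
        return hash(lexical) % len(self.buckets)  # L[hash(t) % L.length]

    def add(self, lexical, suffix_id):
        """Record that a literal with this lexical form and suffix exists."""
        self.buckets[self._slot(lexical)] |= 1 << suffix_id

    def may_have(self, lexical, suffix_id):
        """True if some literal with this lexical form MAY have the suffix."""
        return bool((self.buckets[self._slot(lexical)] >> suffix_id) & 1)
```

Keeping each pair list sorted is what allows both the membership test and the range scan in `objects_of` to start from a binary search. In the filter, a `False` answer from `may_have` is definitive, while a `True` answer still has to be confirmed against the dictionary.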
Created:
2025-02-01



