HiT: Language Models as Hierarchy Encoders
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/record/10511042
Resource description:
About
Datasets for training and evaluating the Hierarchy Transformer encoders (HiTs) proposed in the paper titled: "Language Models as Hierarchy Encoders".
Files with the multi suffix correspond to the Multi-hop Inference evaluation.
Files with the mixed suffix correspond to the Mixed-hop Prediction evaluation (and its transfer setting).
schemaorg, foodon, and doid are used only in the transfer evaluation. The foodon and doid datasets here nevertheless include training sets (see the paper for why we opted not to generate a training set for schemaorg).
The previous version of this dataset collection has been marked as deprecated because it appears to contain broken files for snomed.
Huggingface Datasets
We offer a convenient Huggingface Datasets entry that lets users load the data directly with the load_dataset method. The datasets are available in two formats: entity triplets or labelled entity pairs. Note that when the data is loaded this way, the original entity IDs are not retained. To map entities back to their original hierarchies, refer to this Zenodo release.
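To illustrate how the two formats mentioned above relate, here is a minimal sketch that expands entity triplets into labelled entity pairs. The field names (child, parent, negative, label) and the example entities are assumptions made for illustration; the actual column names in the Huggingface release may differ, and this is not the authors' conversion code.

```python
from typing import Dict, List


def triplets_to_pairs(triplets: List[Dict[str, str]]) -> List[Dict[str, object]]:
    """Expand each (child, parent, negative) triplet into two labelled pairs.

    Field names are hypothetical; adjust them to the columns of the data
    you actually load.
    """
    pairs: List[Dict[str, object]] = []
    for t in triplets:
        # The true parent forms a positive pair (label 1).
        pairs.append({"child": t["child"], "parent": t["parent"], "label": 1})
        # The negative entity forms a negative pair (label 0).
        pairs.append({"child": t["child"], "parent": t["negative"], "label": 0})
    return pairs


# Toy example with made-up entity names.
example_triplets = [{"child": "dog", "parent": "canine", "negative": "fruit"}]
example_pairs = triplets_to_pairs(example_triplets)
```

Each triplet yields one positive and one negative pair, so a triplet set of size n becomes a pair set of size 2n under this assumed layout.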
Citation
The relevant paper has been accepted at NeurIPS 2024 (to appear).
Links
GitHub repository: https://github.com/KRR-Oxford/HierarchyTransformers
Models and Datasets on Huggingface Hub: https://huggingface.co/Hierarchy-Transformers
Arxiv preprint: https://arxiv.org/abs/2401.11374
Contact
Yuan He (yuan.he(at)cs.ox.ac.uk)
Created:
2025-03-26



