Hypernym-LIBre: A free Web-based Corpus for Hypernym Detection
Source: NIAID Data Ecosystem (indexed 2026-03-11)
Download link:
https://zenodo.org/record/3662203
Description:
Finding hypernyms in large text corpora is a fundamental problem in NLP and underpins many mainstream natural language tasks in AI. In our paper, we introduce a new, free, web-based corpus for hypernym detection and show that, using this corpus, we achieve results similar to those of state-of-the-art pattern-based methods that rely on a well-known corpus that is not freely available. The dataset provided here is the one used in our paper; we release it under an open license so that others can apply different methods and techniques for hypernym detection.
The dataset combines the UMBC corpus and the Wikipedia corpus. Dependency-parsed and POS-tagged versions are available at DOI 10.5281/zenodo.3689303.
Contents:
Hypernym-LIBre.zip: raw text, 11.3 GB compressed, 32 GB uncompressed, 288 files of ~110 MB each
10.5281/zenodo.3689303: PoS- and dependency-annotated version, ~15 GB compressed, 80 GB uncompressed, 442 files of ~180 MB each
10.5281/zenodo.3695237: hyponym-hypernym pairs extracted from Hypernym-LIBre using Hearst patterns
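To illustrate what extracting hyponym-hypernym pairs with Hearst patterns can look like, here is a minimal Python sketch. It covers only one classic pattern ("X such as A, B and C") and approximates each noun phrase by a single word. The names `SUCH_AS`, `SPLIT`, and `extract_pairs` are hypothetical, and this is not necessarily the pattern set or the tooling used to build the pairs dataset.

```python
import re

# Assumption: a noun phrase is approximated by a single alphabetic word.
# A real pipeline would rely on PoS tags or a chunker instead.
WORD = r"[A-Za-z]+"

# One classic Hearst pattern: "<hypernym> such as <hyponym>(, <hyponym>)* (and|or) <hyponym>"
SUCH_AS = re.compile(
    rf"\b(?P<hyper>{WORD}),? such as "
    rf"(?P<hypos>{WORD}(?:(?:, (?:and |or )?| and | or ){WORD})*)"
)

# Separators between enumerated hyponyms: ", and ", " and ", ", or ", " or ", ", "
SPLIT = re.compile(r",? (?:and|or) |, ")


def extract_pairs(sentence):
    """Return (hyponym, hypernym) pairs found by the 'such as' pattern."""
    pairs = []
    for match in SUCH_AS.finditer(sentence):
        hypernym = match.group("hyper").lower()
        for hyponym in SPLIT.split(match.group("hypos")):
            pairs.append((hyponym.lower(), hypernym))
    return pairs


if __name__ == "__main__":
    print(extract_pairs("He likes animals such as cats, dogs, and birds."))
    # [('cats', 'animals'), ('dogs', 'animals'), ('birds', 'animals')]
```

Because noun phrases are reduced to single words, multi-word terms and relative clauses after the list are handled poorly; the annotated version of the corpus would be a more reliable starting point for anything beyond a demonstration.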
Created:
2020-03-03



