Hypernym-LIBre: A free Web-based Corpus for Hypernym Detection

Source: NIAID Data Ecosystem. Indexed 2026-03-11.

Download link:

https://zenodo.org/record/3662203

Description:

Finding hypernyms in large text corpora is a fundamental problem in NLP, and it provides a basis for mainstream natural language problems in AI. In our paper, we introduce a new, free, web-based corpus for hypernym detection. We show that, with this corpus, pattern-based methods achieve results similar to the state-of-the-art results obtained with a well-known corpus that is not freely available. The dataset provided here is the one used in our paper. We release it under an open license so that others can apply different methods and techniques for hypernym detection. The dataset is a combination of the UMBC corpus and the Wikipedia corpus. Its dependency-parsed and POS-tagged versions are available at this DOI: 10.5281/zenodo.3689303

Contents:

Hypernym-LIBre.zip: raw text, 11.3 GB compressed, 32 GB uncompressed, 288 files of ~110 MB each

10.5281/zenodo.3689303: PoS- and dependency-annotated text, ~15 GB compressed, 80 GB uncompressed, 442 files of ~180 MB each

10.5281/zenodo.3695237: hyponym-hypernym pairs extracted from Hypernym-LIBre using Hearst patterns
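
For readers unfamiliar with Hearst patterns, the following is a minimal sketch of how a single lexical pattern ("X such as A, B and C") can yield hyponym-hypernym pairs from raw text. It is an illustration only, not the extraction pipeline used for this dataset: the names `NP`, `SUCH_AS`, and `extract_pairs` are hypothetical, and a single word stands in for a noun phrase, whereas pattern-based systems usually match noun phrases on POS-tagged text and use a larger set of patterns.

```python
import re

# Assumption: a noun phrase is approximated by one word (letters and hyphens).
NP = r"[A-Za-z][A-Za-z-]*"

# One Hearst-style pattern: "<hypernym> such as <hyponym>, <hyponym> and <hyponym>".
# The negative lookahead keeps "and"/"or" from being read as a list item.
SUCH_AS = re.compile(
    rf"\b({NP}) such as "
    rf"({NP}(?:, (?!and\b|or\b){NP})*(?:,? (?:and|or) {NP})?)"
)


def extract_pairs(sentence):
    """Return (hyponym, hypernym) tuples found by the "such as" pattern."""
    pairs = []
    for match in SUCH_AS.finditer(sentence):
        hypernym = match.group(1)
        # Pull the individual words out of the list, dropping the conjunction.
        words = re.findall(NP, match.group(2))
        hyponyms = [w for w in words if w not in ("and", "or")]
        pairs.extend((hyponym, hypernym) for hyponym in hyponyms)
    return pairs


if __name__ == "__main__":
    example = "He plays instruments such as guitars, pianos and drums."
    for hyponym, hypernym in extract_pairs(example):
        print(f"{hyponym}\t{hypernym}")
```

Applied line by line to the raw text files, a function of this kind produces candidate pairs that can then be counted and filtered by frequency. The single-word approximation misses multi-word terms, which is one reason the POS-tagged and dependency-parsed versions of the corpus are useful for this task.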

Created:

2020-03-03