BPEmb: Pre-trained Subword Embeddings in 275 Languages (LREC 2018)
Source: DataCite Commons | Updated: 2025-01-28 | Indexed: 2025-04-17
Download link:
https://heidata.uni-heidelberg.de/citation?persistentId=doi:10.11588/DATA/V9CXPR
Description:
BPEmb is a collection of pre-trained subword unit embeddings in 275 languages, based on Byte-Pair Encoding (BPE).
In an evaluation using fine-grained entity typing as a testbed, BPEmb performs competitively with alternative subword
approaches, and for some languages better, while requiring vastly fewer resources and no tokenization.
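To make the term concrete, the following is a minimal toy sketch of Byte-Pair Encoding, assuming a plain list of words as training data. It is not the procedure used to build BPEmb; the function names `learn_bpe`, `merge_pair`, and `segment` are hypothetical and exist only for this illustration.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def learn_bpe(words, num_merges):
    """Learn up to `num_merges` merge rules from a list of words (toy version)."""
    # Start from single characters; keep a frequency for each distinct word.
    vocab = Counter(tuple(w) for w in words)
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by word frequency.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break  # every word is already a single symbol
        # Most frequent pair; ties go to the pair seen first.
        best = max(pairs, key=pairs.get)
        merges.append(best)
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            new_vocab[merge_pair(symbols, best)] += freq
        vocab = new_vocab
    return merges


def segment(word, merges):
    """Split `word` into subword units by replaying the learned merges in order."""
    symbols = tuple(word)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)


# Hypothetical toy corpus, for illustration only.
merges = learn_bpe(["low", "low", "lower", "lowest"], num_merges=2)
print(segment("lowly", merges))
```

Each learned subword unit would then be assigned its own embedding vector, so a word that never appeared in training can still be represented through its subword pieces.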
Provider:
heiDATA
Created:
2019-02-06



