XL-WSD-LLM: Extending XL-WSD to evaluate Large Language Models
Source: NIAID Data Ecosystem. Indexed: 2026-05-02
Download link:
https://zenodo.org/record/15007562
Description:
XL-WSD-LLM extends the XL-WSD benchmark. Starting from XL-WSD, we build a set of prompts for evaluating Large Language Models (LLMs) in two settings: a multiple-choice task, and a generative task in which we assess the quality of the generated definition.
The benchmark consists of three compressed archives. Two contain the training and test data for each task and language; the third contains the output of the LLMs that we evaluated. Each dataset is split into two folders, FT and TT. FT contains data without machine translation, while TT contains data in which missing glosses are automatically translated.
More details are available in the pre-print article "Exploring the Word Sense Disambiguation Capabilities of Large Language Models," published on arXiv.org.
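To make the multiple-choice setting described above concrete, here is a minimal sketch in Python. It is not the benchmark's actual prompt template or scoring script: the prompt wording, the function names (`build_prompt`, `parse_choice`, `accuracy`), and the example sentence and glosses are all assumptions made for illustration. The idea is only that each candidate gloss becomes a lettered option, and the model's reply is mapped back to an option index for scoring.

```python
import re
from string import ascii_uppercase


def build_prompt(sentence, target, glosses):
    """Format one multiple-choice WSD item (hypothetical layout, not the benchmark's)."""
    options = [f"{ascii_uppercase[i]}) {g}" for i, g in enumerate(glosses)]
    lines = [
        f"Sentence: {sentence}",
        f'Which definition matches the word "{target}" in this sentence?',
        *options,
        "Answer with the letter of the correct option.",
    ]
    return "\n".join(lines)


def parse_choice(reply, n_options):
    """Return the index of the first standalone option letter in the reply, or None.

    Naive on purpose: a reply that starts with the article "A" would be
    read as option A.
    """
    valid = ascii_uppercase[:n_options]
    match = re.search(rf"\b([{valid}])\b", reply.upper())
    return valid.index(match.group(1)) if match else None


def accuracy(predictions, gold):
    """Fraction of items whose predicted index equals the gold index."""
    correct = sum(p == g for p, g in zip(predictions, gold))
    return correct / len(gold)


# Illustrative item (made up for this sketch, not taken from the dataset)
prompt = build_prompt(
    "He sat on the bank of the river.",
    "bank",
    ["a financial institution", "sloping land beside a body of water"],
)
print(prompt)
```

An unparseable reply yields `None`, which never equals a gold index and is therefore counted as an error by `accuracy`.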
Created:
2025-03-11



