XL-WSD-LLM: Extending XL-WSD to evaluate Large Language Models

Indexed from NIAID Data Ecosystem on 2026-05-02
Download link:
https://zenodo.org/record/15007562
Description:
This benchmark extends XL-WSD with a set of prompts for evaluating Large Language Models (LLMs) in two settings: a multiple-choice task, and a generative task in which the quality of the generated definition is assessed. The benchmark consists of three compressed archives. Two archives contain training and test data for each task and language, while the third holds the output of the LLMs that we evaluated. Each dataset is split into two folders, FT and TT. FT contains data without machine translation, while TT contains data in which missing glosses were automatically translated. More details are available in the pre-print article "Exploring the Word Sense Disambiguation Capabilities of Large Language Models," published on arXiv.org.
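To illustrate the multiple-choice setting described above, here is a minimal Python sketch that turns one word sense disambiguation item into a lettered prompt and checks a model reply against the gold option. The prompt wording, the function names `build_multiple_choice_prompt` and `score_answer`, and the example sentence are all hypothetical; they are not taken from the benchmark archives, whose actual prompt templates and file layout may differ.

```python
from string import ascii_uppercase


def build_multiple_choice_prompt(sentence, target, glosses):
    """Format one WSD item as a lettered multiple-choice question.

    Hypothetical template: the real benchmark prompts may be worded differently.
    """
    if not 1 <= len(glosses) <= len(ascii_uppercase):
        raise ValueError("expected between 1 and 26 candidate glosses")
    # Pair each candidate gloss with a letter: A), B), C), ...
    options = [f"{letter}) {gloss}" for letter, gloss in zip(ascii_uppercase, glosses)]
    lines = [
        f'Sentence: "{sentence}"',
        f'Which definition best matches the word "{target}" in the sentence above?',
        *options,
        "Answer with the letter of the correct option.",
    ]
    return "\n".join(lines)


def score_answer(model_output, gold_index):
    """Return True if the reply starts with the letter of the gold gloss."""
    reply = model_output.strip().upper()
    return reply[:1] == ascii_uppercase[gold_index]


if __name__ == "__main__":
    prompt = build_multiple_choice_prompt(
        "She sat on the bank of the river.",
        "bank",
        ["a financial institution", "sloping land beside a body of water"],
    )
    print(prompt)
    print(score_answer("B", gold_index=1))
```

Scoring on the first letter of the reply is a deliberately simple choice for this sketch; a real evaluation would likely need more robust answer extraction.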
Created:
2025-03-11