iperbole/multi_lmentry
Source: Hugging Face. Updated 2026-04-29; indexed 2026-05-03.
Download link:
https://hf-mirror.com/datasets/iperbole/multi_lmentry
Description:
Multi-LMentry is a multilingual benchmark for evaluating large language models (LLMs) on elementary-level tasks in nine languages, including Farsi. It extends LMentry (Efrat et al., 2023), which tests LLMs on tasks that are trivial for humans but often difficult for models. Native speakers created the data manually to ensure linguistic and cultural appropriateness rather than relying on direct translation. The dataset is organized into language folders, each containing JSON files for specific tasks. Tasks include simple sentence construction, contextual word choice, and alphabetic reasoning, among others; some are language-specific (for example, rhyming-word tasks are excluded for languages where they do not apply). The dataset is intended for LLM evaluation, cross-lingual comparison (especially between high-resource and low-resource languages), and diagnostics or unit tests of fundamental model abilities. It is not intended for directly training language models.
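The folder-per-language layout described above can be explored with a short script. The following is a minimal sketch that assumes the dataset has already been downloaded to a local directory and that task files use a `.json` extension. The helper names `index_tasks` and `load_task` are hypothetical, and nothing is assumed about the actual folder names, task names, or the schema inside each JSON file.

```python
import json
from pathlib import Path


def index_tasks(root):
    """Map each language folder under `root` to its sorted task file stems.

    Assumption: every immediate subdirectory of `root` is a language folder
    and every `*.json` file inside it is one task.
    """
    index = {}
    for lang_dir in sorted(Path(root).iterdir()):
        # Skip loose files (e.g. a README) and hidden folders.
        if not lang_dir.is_dir() or lang_dir.name.startswith("."):
            continue
        tasks = sorted(p.stem for p in lang_dir.glob("*.json"))
        if tasks:
            index[lang_dir.name] = tasks
    return index


def load_task(root, language, task):
    """Parse one task file; the JSON schema is left to the caller to inspect."""
    with open(Path(root) / language / f"{task}.json", encoding="utf-8") as f:
        return json.load(f)
```

Calling `index_tasks` on the local copy gives a quick view of which tasks exist for which language, which is useful for cross-lingual comparison because some tasks are language-specific and may be missing from certain folders.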
Provider:
iperbole



