
iperbole/multi_lmentry

Hugging Face | Updated 2026-04-29 | Indexed 2026-05-03
Download link: https://hf-mirror.com/datasets/iperbole/multi_lmentry
Description:
Multi-LMentry is a multilingual benchmark for evaluating large language models (LLMs) on fundamental, elementary-level tasks in nine languages, including Farsi. It is a multilingual extension of LMentry (Efrat et al., 2023), which tests LLMs on tasks that are trivial for humans but often challenging for models. The data was created manually by native speakers, rather than by direct translation, to ensure linguistic and cultural appropriateness. The dataset is organized into one folder per language, and each folder contains JSON files for specific tasks. Tasks include simple sentence construction, contextual word choice, and alphabetic reasoning, among others. Some tasks are language-specific; for example, rhyming-word tasks are omitted for languages where they do not apply. The dataset is intended for evaluating LLMs, for cross-lingual comparison (especially between high-resource and low-resource languages), and for diagnostics or unit tests of fundamental model abilities. It is not intended for directly training language models.
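
The folder-per-language layout described above can be walked with a few lines of Python. The following is a minimal sketch, assuming a local copy of the dataset in which each language is a top-level folder holding one JSON file per task. The function names `index_tasks` and `shared_tasks` are hypothetical helpers, and nothing here assumes a particular schema inside the JSON files or particular folder or file names.

```python
import json
from pathlib import Path


def index_tasks(root):
    """Map each language folder to its task files: {language: {task: parsed JSON}}.

    Assumes (not confirmed by the listing) that every language is a top-level
    folder under `root` and every task is a single .json file inside it.
    """
    index = {}
    for lang_dir in sorted(Path(root).iterdir()):
        # Skip stray files and hidden folders such as .git
        if not lang_dir.is_dir() or lang_dir.name.startswith("."):
            continue
        tasks = {}
        for task_file in sorted(lang_dir.glob("*.json")):
            with task_file.open(encoding="utf-8") as fh:
                tasks[task_file.stem] = json.load(fh)
        if tasks:
            index[lang_dir.name] = tasks
    return index


def shared_tasks(index):
    """Return the task names present in every language folder."""
    task_sets = [set(tasks) for tasks in index.values()]
    return set.intersection(*task_sets) if task_sets else set()
```

Because some tasks exist only for certain languages, restricting a cross-lingual comparison to the output of `shared_tasks` is one way to keep the comparison like-for-like.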
Provider: iperbole