
caro-holt/MultiQ

Hugging Face | Updated 2024-03-02 | Indexed 2024-03-04
Download link:
https://hf-mirror.com/datasets/caro-holt/MultiQ
Resource description:
---
license: cc-by-4.0
language: [tl, sm, mk, gu, fi, mn, bm, ta, ur, hy, nl, tk, en, bg, gd, pt, ko,
  ga, eu, sv, bs, co, fr, gn, ro, it, dv, ku, ak, eo, zu, id, te, sl, lv, pa,
  ru, si, ee, yi, ny, az, sw, hi, mt, sr, hr, ka, ug, tt, lg, kn, fy, kk, ca,
  lb, jv, et, la, tr, ps, km, zh, uk, as, he, yo, sq, da, gl, vi, ay, is, ln,
  mr, st, xh, cs, ky, ml, ht, mi, so, uz, el, ti, be, cy, am, ig, or, fa, ms,
  su, de, lo, ha, ts, om, ar, my, es, qu, 'no', th, sa, mg, pl, sd, sk, bn, rw,
  af, ne, lt, tg, ja, sn, hu]
size_categories:
- 10K<n<100K
task_categories:
- question-answering
---

# Dataset Card for MultiQ

This is the dataset corresponding to the paper "Evaluating the Elementary Multilingual Capabilities of Large Language Models with MultiQ". It is a silver-standard benchmark that can be used to evaluate the basic multilingual capabilities of LLMs. It contains 200 open-ended questions automatically translated into 137 typologically diverse languages.

- **Curated by:** Carolin Holtermann, Paul Röttger, Timm Dill, Anne Lauscher
- **Language(s) (NLP):** 137 diverse languages described in detail in our paper
- **License:** CC-BY-4.0 License

### Dataset Sources

- **Repository:** [GitHub](https://github.com/paul-rottger/multiq)
- **Paper:** TBD
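
Since the card does not state the split names or column names, a first step for anyone using the benchmark is to load it and inspect its layout. The following is a minimal sketch, assuming the Hugging Face `datasets` package is installed and that the repository loads with its default configuration. The helper names `load_multiq` and `describe` are hypothetical and used here only for illustration.

```python
"""Sketch: load MultiQ from the Hugging Face Hub and inspect its layout."""

REPO_ID = "caro-holt/MultiQ"  # repository name taken from the listing


def load_multiq(repo_id: str = REPO_ID):
    """Download the dataset and return whatever splits the repository defines.

    The import is local so this module can be imported even when the
    `datasets` package is not installed.
    """
    from datasets import load_dataset  # pip install datasets

    # With no `split` argument, load_dataset returns a mapping of
    # split name -> Dataset.
    return load_dataset(repo_id)


def describe(dataset_dict) -> dict:
    """Map each split name to its row count and column names.

    Split and column names are not stated in the card, so they are
    read from the loaded object rather than assumed.
    """
    return {
        split: {"rows": ds.num_rows, "columns": ds.column_names}
        for split, ds in dataset_dict.items()
    }
```

Calling `describe(load_multiq())` requires network access and returns one entry per split, which shows the actual schema before any evaluation code is written against it.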
Provided by:
caro-holt
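Summary of original information

Dataset overview

Basic information

  • Dataset name: MultiQ
  • Related paper: "Evaluating the Elementary Multilingual Capabilities of Large Language Models with MultiQ"
  • Purpose: evaluating the basic multilingual capabilities of large language models

Dataset contents

  • Number of questions: 200 open-ended questions
  • Languages: 137 typologically diverse languages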
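
The two figures above can be cross-checked against the `size_categories` value in the card metadata (`10K<n<100K`). The sketch below assumes that every question is translated into every language and that the dataset stores one row per (question, language) pair; neither assumption is confirmed by the card. The helper names `parse_bound` and `in_size_category` are hypothetical.

```python
import re

N_QUESTIONS = 200             # from the dataset card
N_LANGUAGES = 137             # from the dataset card
SIZE_CATEGORY = "10K<n<100K"  # `size_categories` value in the card metadata

_UNITS = {"": 1, "K": 1_000, "M": 1_000_000, "B": 1_000_000_000}


def parse_bound(text: str) -> int:
    """Convert a bound such as '10K' or '100K' to an integer."""
    match = re.fullmatch(r"(\d+)([KMB]?)", text)
    if match is None:
        raise ValueError(f"unrecognised bound: {text!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]


def in_size_category(n: int, category: str) -> bool:
    """Check n against a two-sided category of the form 'LOW<n<HIGH'."""
    low, high = category.split("<n<")
    return parse_bound(low) < n < parse_bound(high)


# Assumption: one row per (question, language) pair.
expected_rows = N_QUESTIONS * N_LANGUAGES
print(expected_rows)                                   # 27400
print(in_size_category(expected_rows, SIZE_CATEGORY))  # True
```

Under these assumptions the expected row count of 27,400 falls inside the declared size category, so the stated counts and the metadata are at least mutually consistent.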

Dataset details

  • Language list: 137 languages, described in detail in the paper
  • License: CC-BY-4.0

Dataset sources

  • Code repository: GitHub
  • Related paper: TBD