orai-nlp/EHQA
Source: Hugging Face. Updated 2024-07-10. Indexed 2024-07-22.
Download link:
https://hf-mirror.com/datasets/orai-nlp/EHQA
Description:
EHQA is a dataset of 1,500 multiple-choice questions designed to evaluate factual knowledge about the Basque Country. The questions were created from articles about the Basque Country taken from the Basque Wikipedia. Article selection was balanced by gender of persons, type of item, and province of origin. The articles were translated into English with Elia, and one multiple-choice question was generated from each article using GPT-4 Turbo. All questions and answers were manually reviewed. The corpus is released in JSON format. Each multiple-choice question has the following attributes: question, answer, choices, id, and url. The dataset also includes information on culture adaptation experiments, which use the Harness framework to evaluate how well large language models adapt to Basque culture.
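As a rough illustration of the record layout described above, the following Python sketch checks that a record carries the five listed attributes and renders it as a lettered multiple-choice prompt. The sample record is invented for illustration and is not taken from the dataset; the value types (for example, whether `answer` holds the answer text or an index into `choices`) are assumptions, as are the helper names `validate_record` and `format_prompt`.

```python
import json

# The five attributes listed in the dataset description.
REQUIRED_KEYS = {"question", "answer", "choices", "id", "url"}


def validate_record(record: dict) -> None:
    """Raise ValueError if any of the described attributes is missing."""
    missing = REQUIRED_KEYS - set(record)
    if missing:
        raise ValueError(f"record is missing attributes: {sorted(missing)}")


def format_prompt(record: dict) -> str:
    """Render a record as a lettered multiple-choice prompt."""
    validate_record(record)
    lines = [record["question"]]
    for letter, choice in zip("ABCDEFGH", record["choices"]):
        lines.append(f"{letter}. {choice}")
    lines.append("Answer:")
    return "\n".join(lines)


# Hypothetical record, NOT from EHQA. Field types are assumed:
# here `answer` is stored as the answer text.
sample = json.loads("""
{
  "id": "example-0",
  "question": "Which city is the capital of Gipuzkoa?",
  "choices": ["Bilbao", "Donostia", "Vitoria-Gasteiz", "Pamplona"],
  "answer": "Donostia",
  "url": "https://eu.wikipedia.org/wiki/Donostia"
}
""")

prompt = format_prompt(sample)
print(prompt)
```

If the released files store `answer` differently, only the comparison against a model's output needs to change; the attribute check stays the same.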
Provider:
orai-nlp



