juletxara/EHQA
Hugging Face | Updated 2024-07-15 | Indexed 2024-07-22
Download link:
https://hf-mirror.com/datasets/juletxara/EHQA
Description:
EHQA is a dataset of 1500 multiple-choice questions designed to evaluate factual knowledge about the Basque Country. The questions were generated from articles about the Basque Country taken from the Basque Wikipedia, and were processed with GPT-4 Turbo and the Elia translation tool. The article selection was balanced by gender of persons, type of item, and province of origin, and the questions were manually reviewed for quality. The data is stored as JSON; each question has the attributes question, answer, choices, id, and url (the source article). The accompanying work also reports experiments on adapting Large Language Models (LLMs) to Basque culture, comparing techniques such as continual pre-training, knowledge editing, and RAG.
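As a rough illustration of the record layout described above, the following Python sketch parses one record, formats it as a prompt, and scores predictions by accuracy. Only the field names (question, answer, choices, id, url) come from the description; the sample values, the placeholder URL, the helper names `format_prompt` and `accuracy`, and the assumption that `answer` is a 0-based index into `choices` are all hypothetical and should be checked against the actual files.

```python
import json

# Hypothetical record: the field names follow the dataset description, but the
# values are invented for illustration. The type of "answer" is an assumption
# (a 0-based index into "choices"); the real dataset may store it differently.
SAMPLE_JSON = """
[
  {
    "id": 0,
    "question": "Which city is the capital of Gipuzkoa?",
    "choices": ["Bilbao", "Donostia", "Vitoria-Gasteiz"],
    "answer": 1,
    "url": "https://example.org/placeholder"
  }
]
"""


def format_prompt(record):
    """Render a question and its lettered choices as a plain-text prompt."""
    letters = "ABCDEFGH"
    lines = [record["question"]]
    for letter, choice in zip(letters, record["choices"]):
        lines.append(f"{letter}. {choice}")
    lines.append("Answer:")
    return "\n".join(lines)


def accuracy(records, predictions):
    """Fraction of records whose predicted choice index matches "answer"."""
    if not records:
        return 0.0
    correct = sum(1 for r, p in zip(records, predictions) if r["answer"] == p)
    return correct / len(records)


records = json.loads(SAMPLE_JSON)
prompt = format_prompt(records[0])
score = accuracy(records, [1])  # one prediction (choice index) per record
```

In practice the predictions would come from a model answering each prompt; here they are supplied by hand so the sketch stays self-contained.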
Provided by:
juletxara



