BLUEX
arXiv | Updated 2023-07-12 | Indexed 2024-06-21
Download link:
https://github.com/Portuguese-Benchmark-Datasets/BLUEX
Resource overview:
The BLUEX dataset was created by the State University of Campinas (UNICAMP) and the University of São Paulo (USP). It contains over 1,000 multiple-choice questions drawn from the entrance exams of leading Brazilian universities. The questions cover high school subjects such as mathematics, physics, and chemistry, and image positions are annotated to support research on multimodal language understanding. The dataset was built by extracting questions automatically and then correcting them manually. It is intended for evaluating and improving Portuguese natural language processing models, particularly on standardized tests.
Provided by:
State University of Campinas (UNICAMP)
Created:
2023-07-12



