WueNLP/SMPQA
Source: Hugging Face. Updated: 2025-01-10. Indexed: 2025-04-12.
Download link:
https://hf-mirror.com/datasets/WueNLP/SMPQA
Description:
The SMPQA (Synthetic Multilingual Plot QA) dataset consists of synthetic bar plots and pie charts, generated from word lists in different languages, together with questions about those plots. It is intended as an initial way to evaluate the multilingual OCR capabilities of models in arbitrary languages. There are two sub-tasks: 1. Grounding, which requires localizing a text label from the question in the image in order to answer a yes/no question (e.g., "Is the bar with label X the tallest?"). 2. Reading, which requires reading a label from the plot based on the question (e.g., "What is the label of the red bar?"). The dataset currently supports 11 languages, with 100 plots and associated questions per language.
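The two sub-tasks described above could be scored with a simple exact-match comparison. The following is a minimal sketch, not the dataset's official evaluation: the function names, the normalization steps (Unicode NFC, whitespace trimming, case folding), and the sample answers are all assumptions made for illustration.

```python
import unicodedata


def normalize(text: str) -> str:
    """Normalize an answer string before comparison (assumed scheme, not official).

    NFC normalization makes composed and decomposed forms of the same
    character compare equal, which matters for labels in non-Latin or
    accented scripts.
    """
    return unicodedata.normalize("NFC", text).strip().casefold()


def exact_match_accuracy(predictions: list[str], references: list[str]) -> float:
    """Return the fraction of predictions that equal their reference after normalization."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("cannot score an empty set of answers")
    correct = sum(
        normalize(pred) == normalize(ref)
        for pred, ref in zip(predictions, references)
    )
    return correct / len(references)


# Hypothetical model outputs for the grounding sub-task (yes/no answers).
grounding_score = exact_match_accuracy(["Yes", "no"], ["yes", "yes"])

# Hypothetical model outputs for the reading sub-task (labels read from the plot).
# The second prediction uses a decomposed accent (a + U+0301) on purpose.
reading_score = exact_match_accuracy(["Haus ", "a\u0301rbol"], ["Haus", "\u00e1rbol"])
```

Exact match is a strict choice for the reading sub-task, since a single wrong character counts as a failure; a character-level edit distance would be a more lenient alternative.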
Provided by:
WueNLP



