GQA-ru

Name: GQA-ru
Creator: maas
Published: 2025-12-05 16:44:12
License: No description available

ModelScope community; updated 2025-12-05; indexed 2025-08-02

Download link:

https://modelscope.cn/datasets/deepvk/GQA-ru

Description:

# GQA-ru

This is a Russian translation of the original [GQA](https://cs.stanford.edu/people/dorarad/gqa/about.html) dataset, stored in a format supported by the [`lmms-eval`](https://github.com/EvolvingLMMs-Lab/lmms-eval) pipeline.

To build this dataset, we:

1. Translated the original dataset with `gpt-4-turbo`
2. Filtered out unsuccessful translations, i.e. those where the model's protection was triggered
3. Manually validated the most common errors

## Dataset Structure

The dataset includes train and test splits translated from the original `train_balanced` and `testdev_balanced` splits.

The train split includes 27519 images with 40000 questions about them, and the test split contains 398 images with 12216 different questions about them.

The storage format is similar to [`lmms-lab/GQA`](https://huggingface.co/datasets/lmms-lab/GQA). Key fields:

* `id`: ID of a question
* `imageId`: ID of an image (images are stored in a separate table)
* `question`: text of a question
* `answer`: one-word answer
* `fullAnswer`: detailed answer

## Usage

The easiest way to evaluate a model on `GQA-ru` is through [`lmms-eval`](https://github.com/EvolvingLMMs-Lab/lmms-eval).

For example, to evaluate [`deepvk/llava-saiga-8b`](https://huggingface.co/deepvk/llava-saiga-8b):

```bash
accelerate launch -m lmms_eval --model llava_hf \
  --model_args pretrained="deepvk/llava-saiga-8b" \
  --tasks gqa-ru --batch_size 1 \
  --log_samples --log_samples_suffix llava-saiga-8b --output_path ./logs/
```

This prints a table with the results. The main metric for this task is `ExactMatch` on the one-word answer: whether the generated word exactly matches the ground truth.
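To make the metric concrete, here is a minimal sketch of an exact-match score over one-word answers. It is not the `lmms-eval` implementation: the `normalize` helper, its whitespace and case handling, and the toy records are assumptions made for illustration. Only the `id` and `answer` field names come from the dataset description above.

```python
from typing import Sequence


def normalize(text: str) -> str:
    # Assumption: trim whitespace and ignore case before comparing.
    # The real pipeline may normalize differently (or not at all).
    return text.strip().lower()


def exact_match(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Fraction of predictions that equal their reference after normalization."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    hits = sum(normalize(p) == normalize(r) for p, r in zip(predictions, references))
    return hits / len(references)


# Toy records (invented for illustration) using the `id` and `answer` fields.
samples = [
    {"id": "q1", "answer": "да"},
    {"id": "q2", "answer": "нет"},
    {"id": "q3", "answer": "собака"},
]
# Hypothetical model outputs keyed by question id.
model_outputs = {"q1": "да", "q2": "Нет ", "q3": "кошка"}

score = exact_match(
    [model_outputs[s["id"]] for s in samples],
    [s["answer"] for s in samples],
)
print(f"ExactMatch: {score:.3f}")
```

Keeping normalization in its own function makes it easy to swap in stricter or looser matching when comparing against the numbers reported by the evaluation pipeline.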
## Citation

```
@inproceedings{hudson2019gqa,
  title={Gqa: A new dataset for real-world visual reasoning and compositional question answering},
  author={Hudson, Drew A and Manning, Christopher D},
  booktitle={Proceedings of the IEEE/CVF conference on computer vision and pattern recognition},
  pages={6700--6709},
  year={2019}
}
```

```
@misc{deepvk2024gqa_ru,
  title={GQA-ru},
  author={Belopolskih, Daniil and Spirin, Egor},
  url={https://huggingface.co/datasets/deepvk/GQA-ru},
  publisher={Hugging Face},
  year={2024}
}
```

Provider: maas

Created: 2025-08-01
