
GQA-ru

ModelScope community. Updated: 2025-12-05. Indexed: 2025-08-02.
Download link:
https://modelscope.cn/datasets/deepvk/GQA-ru
Description:
# GQA-ru

This is a translated version of the original [GQA](https://cs.stanford.edu/people/dorarad/gqa/about.html) dataset, stored in a format supported by the [`lmms-eval`](https://github.com/EvolvingLMMs-Lab/lmms-eval) pipeline.

For this dataset, we:

1. Translated the original dataset with `gpt-4-turbo`
2. Filtered out unsuccessful translations, i.e. those where the model's protection was triggered
3. Manually validated the most common errors

## Dataset Structure

The dataset includes train and test splits translated from the original `train_balanced` and `testdev_balanced` splits. The train split includes 27519 images with 40000 questions about them, and the test split contains 398 images with 12216 different questions about them.

The storage format is similar to [`lmms-lab/GQA`](https://huggingface.co/datasets/lmms-lab/GQA). Key fields:

* `id`: ID of a question
* `imageId`: ID of an image (images are stored in a separate table)
* `question`: text of a question
* `answer`: one-word answer
* `fullAnswer`: detailed answer

## Usage

The easiest way to evaluate a model on `GQA-ru` is through [`lmms-eval`](https://github.com/EvolvingLMMs-Lab/lmms-eval).

For example, to evaluate [`deepvk/llava-saiga-8b`](https://huggingface.co/deepvk/llava-saiga-8b):

```bash
accelerate launch -m lmms_eval --model llava_hf \
    --model_args pretrained="deepvk/llava-saiga-8b" \
    --tasks gqa-ru --batch_size 1 \
    --log_samples --log_samples_suffix llava-saiga-8b --output_path ./logs/
```

This prints a table with the results. The main metric for this task is `ExactMatch` on the one-word answer: whether the generated word exactly matches the ground truth.
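To make the `ExactMatch` idea concrete, here is a minimal Python sketch of how such a score could be computed over records with the key fields listed above. It is not the `lmms-eval` implementation: the `normalize` and `exact_match` helpers, the normalization rule (lowercasing and trimming surrounding whitespace and ASCII punctuation), and the sample records are all assumptions made for illustration.

```python
import string


def normalize(text):
    """Lowercase and trim surrounding whitespace and ASCII punctuation.

    This normalization is an assumption for the sketch, not lmms-eval's rule.
    """
    return text.strip().lower().strip(string.punctuation + " ")


def exact_match(predictions, references):
    """Return the fraction of questions whose prediction equals `answer`.

    `predictions` maps a question `id` to the generated one-word answer;
    `references` is a list of records with at least `id` and `answer`.
    A missing prediction counts as a miss.
    """
    if not references:
        return 0.0
    hits = sum(
        normalize(predictions.get(ref["id"], "")) == normalize(ref["answer"])
        for ref in references
    )
    return hits / len(references)


# Invented records that only mimic the field layout; not real dataset rows.
references = [
    {"id": "q1", "imageId": "img1", "question": "Какого цвета машина?",
     "answer": "красный", "fullAnswer": "Машина красная."},
    {"id": "q2", "imageId": "img1", "question": "Это кошка?",
     "answer": "да", "fullAnswer": "Да, это кошка."},
]
predictions = {"q1": "Красный.", "q2": "нет"}

score = exact_match(predictions, references)
print(f"ExactMatch: {score:.2f}")
```

Only the one-word `answer` field is compared here; `fullAnswer` is carried along but not scored, which matches the description of the main metric above.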
## Citation

```
@inproceedings{hudson2019gqa,
  title={Gqa: A new dataset for real-world visual reasoning and compositional question answering},
  author={Hudson, Drew A and Manning, Christopher D},
  booktitle={Proceedings of the IEEE/CVF conference on computer vision and pattern recognition},
  pages={6700--6709},
  year={2019}
}
```

```
@misc{deepvk2024gqa_ru,
  title={GQA-ru},
  author={Belopolskih, Daniil and Spirin, Egor},
  url={https://huggingface.co/datasets/deepvk/GQA-ru},
  publisher={Hugging Face},
  year={2024}
}
```

Provider: maas
Created: 2025-08-01