vkehfdl1/banana-merged

Name: vkehfdl1/banana-merged
Creator: vkehfdl1
Published: 2026-04-24 08:39:33
License: Not specified

Hugging Face | Updated 2026-04-24 | Indexed 2026-04-26

Download link:

https://hf-mirror.com/datasets/vkehfdl1/banana-merged

Description:

Banana-Merged is a synthetic multi-page visual question answering (VQA) dataset with hard negatives, designed for fine-tuning visual document retrievers such as ColFlor and ColPali. It contains 1,100 training samples and 10,054 images (positive pages plus hard negative variants). Each sample pairs a multi-page analytical query with a set of document images that collectively contain the answer, plus one or more hard negative documents that look visually and lexically similar but would be the wrong retrieval result. The dataset was generated by the Nano Banana Pro pipeline, using Gemini 3 Pro Image Preview for image generation and editing. Hard negatives are critical for training retrieval models to distinguish between documents that share surface-level keywords and layout but differ in the specific information needed to answer a query. The dataset is English-only; its structure includes fields for the query, the positive pages, and the hard negative pages. It supports visual document retrieval fine-tuning, multi-page VQA, and hard negative mining research.
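As a rough illustration of how such a sample might feed retriever fine-tuning, the sketch below expands one record into (query, positive page, hard negative page) triplets, the shape many contrastive training loops expect. The field names `query`, `positive_pages`, and `hard_negative_pages` are assumptions based on the description above, not the confirmed schema, and the toy record uses made-up file names in place of real page images.

```python
from itertools import product
from typing import Any, Dict, Iterator, Tuple


def iter_triplets(sample: Dict[str, Any]) -> Iterator[Tuple[str, Any, Any]]:
    """Yield (query, positive_page, hard_negative_page) for one sample.

    Field names are hypothetical; adjust them to the actual dataset schema.
    """
    query = sample["query"]
    # Pair every positive page with every hard negative for the same query.
    for pos, neg in product(sample["positive_pages"], sample["hard_negative_pages"]):
        yield query, pos, neg


# Toy record: file names stand in for page images (illustrative values only).
toy_sample = {
    "query": "How did revenue change between the two quarters?",
    "positive_pages": ["report_p1.png", "report_p2.png"],
    "hard_negative_pages": ["report_p1_edited.png"],
}

triplets = list(iter_triplets(toy_sample))
for query, pos, neg in triplets:
    print(f"{query!r}: positive={pos}, hard_negative={neg}")
```

Pairing every positive page with every hard negative is only one possible design. A training setup could instead sample a single negative per step, or score the whole set of positive pages jointly against each negative.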

Provided by:

vkehfdl1
