Volavion/FineSightBench

Name: Volavion/FineSightBench
Creator: Volavion
Published: 2026-04-23 12:46:12
License: not specified

Source: Hugging Face. Updated 2026-04-23; indexed 2026-04-26.

Download link:

https://hf-mirror.com/datasets/Volavion/FineSightBench

Overview:

FineSightBench is a fine-grained visual benchmark for evaluating vision-language models (VLMs) on pixel-level perception and reasoning tasks. It combines two complementary image regimes: 1) Synthetic canvas: white-background images with precisely sized geometric or semantic targets (letters, animals, shapes, blocks, dots); 2) Text in the wild (SynthText-style): English words rendered onto real natural-scene photographs from the SynthText `bg_img` set, with pixel-accurate control of character cap-height. All images are 448 × 448 px. The primary difficulty axis is target pixel size (cap-height for text), grouped into four levels: extreme, hard, medium, and easy. The dataset has two splits, perception (4,200 samples) and reasoning (3,920 samples), covering multiple task types and difficulty levels.
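As a rough illustration of how the two splits might be accessed, the sketch below records the split sizes stated in the description and wraps the Hugging Face `datasets` loader. The helper names `total_samples` and `load_split` are hypothetical, and it is an assumption that the Hub split names match the words "perception" and "reasoning" and that no configuration name is required; the dataset card should be checked before relying on this.

```python
# Split sizes and image size as stated in the dataset description.
SPLIT_SIZES = {"perception": 4200, "reasoning": 3920}
IMAGE_SIZE = (448, 448)  # width, height in pixels


def total_samples(split_sizes=SPLIT_SIZES):
    """Return the total number of samples across all listed splits."""
    return sum(split_sizes.values())


def load_split(split_name, repo_id="Volavion/FineSightBench"):
    """Load one split from the Hugging Face Hub (hypothetical helper).

    Assumption: the Hub split names are exactly "perception" and
    "reasoning", and the repository needs no configuration name.
    """
    if split_name not in SPLIT_SIZES:
        raise ValueError(f"unknown split: {split_name!r}")
    # Imported lazily so the helpers above work without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(repo_id, split=split_name)
```

Calling `total_samples()` adds the two stated split sizes, and `load_split("perception")` would attempt a download from the Hub, so it needs network access and the `datasets` package.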

Provider:

Volavion
