Volavion/FineSightBench

Source: Hugging Face. Updated 2026-04-23; indexed 2026-04-26.
Download link:
https://hf-mirror.com/datasets/Volavion/FineSightBench
Description:
FineSightBench is a fine-grained visual benchmark for evaluating vision-language models (VLMs) on pixel-level perception and reasoning tasks. It combines two complementary image regimes: (1) synthetic canvas: controlled white-background images with precisely sized geometric and semantic targets (letters, animals, shapes, blocks, dots); and (2) text in the wild (SynthText-style): English words rendered onto real natural-scene photographs from the SynthText `bg_img` set, with pixel-accurate control of character cap-height. All images are 448 × 448 px. The primary difficulty axis is the target's pixel size (cap-height for text), binned into four levels: extreme, hard, medium, and easy. The dataset has two splits, perception and reasoning, with 4,200 and 3,920 samples respectively, covering multiple task types and difficulty levels.
Provider:
Volavion
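
The listing gives a repository ID and a mirror URL but no loading instructions. The following is a minimal sketch, assuming the repository can be read with the Hugging Face `datasets` library and that the mirror is selected through the `HF_ENDPOINT` environment variable. The helper names `load_all_configs` and `count_rows` are hypothetical. Configuration and split names are discovered at runtime because the listing does not state them; in particular, it is not confirmed whether "perception" and "reasoning" are exposed as configurations or as splits.

```python
import os

# Repository ID and mirror host are taken from the listing.
REPO_ID = "Volavion/FineSightBench"
MIRROR_ENDPOINT = "https://hf-mirror.com"

# Sample counts as stated in the description (not verified here).
STATED_COUNTS = {"perception": 4200, "reasoning": 3920}


def load_all_configs(repo_id: str = REPO_ID, use_mirror: bool = True) -> dict:
    """Load every configuration the repository exposes.

    Assumption: the repository is loadable with the `datasets` library.
    Configuration names are queried rather than hard-coded.
    """
    if use_mirror:
        # HF_ENDPOINT has to be set before huggingface_hub is first imported,
        # which is why the `datasets` import happens inside this function.
        os.environ.setdefault("HF_ENDPOINT", MIRROR_ENDPOINT)
    from datasets import get_dataset_config_names, load_dataset

    return {
        name: load_dataset(repo_id, name)
        for name in get_dataset_config_names(repo_id)
    }


def count_rows(loaded: dict) -> dict:
    """Total number of rows per configuration, summed over its splits."""
    return {
        name: sum(len(split) for split in dataset_dict.values())
        for name, dataset_dict in loaded.items()
    }
```

Calling `count_rows(load_all_configs())` and comparing the result with `STATED_COUNTS` (8,120 samples in total) is one way to check a download against the figures in the description.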