Volavion/FineSightBench-Large

Name: Volavion/FineSightBench-Large
Creator: Volavion
Published: 2026-04-23 12:49:17
License: No description available

Source: Hugging Face. Updated 2026-04-23; indexed 2026-04-26.

Download link:

https://hf-mirror.com/datasets/Volavion/FineSightBench-Large

Description:

FineSightBench-Large is a 10× scaled edition of FineSightBench for evaluating Vision-Language Models (VLMs) on fine-grained, pixel-level visual perception and reasoning tasks. The dataset combines two complementary image regimes: 1) Synthetic canvas: controlled white-background images with precisely sized geometric/semantic targets (letters, animals, shapes, blocks, dots); 2) Text in the wild (SynthText-style): English words rendered onto real natural-scene photographs with pixel-accurate control of character cap-height. All images are 448 × 448 px. The primary difficulty axis is the target pixel size (cap-height for text), binned into four levels: extreme, hard, medium, and easy. The dataset is divided into a perception part and a reasoning part, with 42,000 and 39,200 samples respectively, covering a range of task types and image regimes.
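To put the stated figures in perspective, here is a minimal Python sketch that totals the two parts and expresses a target's pixel size as a fraction of the 448 px image side. The function names and the example target size are illustrative assumptions; the listing does not give the pixel thresholds that separate the extreme/hard/medium/easy levels, so none are encoded here.

```python
# Figures taken from the dataset description; everything else is illustrative.
IMAGE_SIDE_PX = 448
PART_SIZES = {"perception": 42_000, "reasoning": 39_200}


def total_samples(part_sizes):
    """Sum the sample counts across all parts of the dataset."""
    return sum(part_sizes.values())


def relative_size(target_px, image_side_px=IMAGE_SIDE_PX):
    """Return the fraction of the image side covered by a target.

    target_px is the target's pixel size (cap-height for text).
    """
    if not 0 < target_px <= image_side_px:
        raise ValueError("target_px must be in (0, image_side_px]")
    return target_px / image_side_px


if __name__ == "__main__":
    print("total samples:", total_samples(PART_SIZES))
    # 8 px is a hypothetical target size, not a documented benchmark threshold.
    print("8 px target covers {:.2%} of the image side".format(relative_size(8)))
```

The total comes to 81,200 samples. The ratio shows why target pixel size works as a difficulty axis: a target only a few pixels tall covers a very small share of a 448 px frame.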

Provider:

Volavion
