Volavion/FineSightBench-Large

Source: Hugging Face. Updated 2026-04-23; indexed 2026-04-26.
Download link:
https://hf-mirror.com/datasets/Volavion/FineSightBench-Large
Summary:
FineSightBench-Large is a 10× scaled edition of FineSightBench for evaluating Vision-Language Models (VLMs) on fine-grained visual perception and reasoning tasks. The dataset combines two complementary image regimes: 1) Synthetic canvas: controlled white-background images with precisely sized geometric or semantic targets (letters, animals, shapes, blocks, dots); 2) Text in the wild (SynthText-style): English words rendered onto real natural-scene photographs with pixel-accurate control of character cap-height. All images are 448 × 448 px. The primary difficulty axis is the target's pixel size (cap-height for text), binned into four levels: extreme, hard, medium, and easy. The dataset is split into perception and reasoning sections of 42,000 and 39,200 samples respectively, covering a range of task types and image regimes.
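
For readers who want to inspect the data programmatically, the following is a minimal sketch, not an official loader. It assumes the repository can be read with the Hugging Face `datasets` library; because the summary does not state config or split names, the sketch asks the Hub for them rather than guessing. The helper names (`total_samples`, `list_configs_and_splits`, `has_expected_size`) are hypothetical, and the constants simply restate figures from the summary above.

```python
"""Sketch: inspect FineSightBench-Large with the Hugging Face `datasets` library."""

REPO_ID = "Volavion/FineSightBench-Large"
IMAGE_SIZE = (448, 448)  # image size stated in the dataset summary
# Per-section sample counts as stated in the dataset summary.
SECTION_SIZES = {"perception": 42_000, "reasoning": 39_200}
DIFFICULTY_LEVELS = ("extreme", "hard", "medium", "easy")


def total_samples(sizes=SECTION_SIZES):
    """Return the sum of the per-section sample counts."""
    return sum(sizes.values())


def list_configs_and_splits(repo_id=REPO_ID):
    """Query the Hub for available configs and splits (names are not assumed).

    Requires `pip install datasets` and network access.
    """
    from datasets import get_dataset_config_names, get_dataset_split_names

    return {
        config: get_dataset_split_names(repo_id, config)
        for config in get_dataset_config_names(repo_id)
    }


def has_expected_size(image, expected=IMAGE_SIZE):
    """Check any object with a PIL-style `.size` tuple against 448 x 448."""
    return tuple(image.size) == expected


if __name__ == "__main__":
    print(total_samples())
    print(list_configs_and_splits())
```

Once the real config and split names are known, `datasets.load_dataset(REPO_ID, name=..., split=...)` can load a section, and `has_expected_size` offers a quick sanity check on decoded images.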
Provider:
Volavion