ZexiJia/Visform
Source: Hugging Face · Updated 2026-03-12 · Indexed 2026-03-29
Download link: https://hf-mirror.com/datasets/ZexiJia/Visform
---
pretty_name: VisForm
language:
- en
license: other
task_categories:
- image-classification
- text-to-image
tags:
- cvpr-2026
- benchmark
- generative-model-evaluation
- image-quality-assessment
- aesthetics
- safety
- human-annotations
- computer-vision
- multimodal
size_categories:
- 100K<n<1M
---
# VisForm
<div align="center">
### A large-scale benchmark for evaluating generative image models across diverse visual forms
**210K Images** • **62 Visual Forms** • **12 Generative Models**
**Expert Annotations** for **Quality**, **Aesthetics**, and **Safety**
[📄 Paper](https://arxiv.org/abs/2603.08064)
</div>
---
## Overview
**VisForm** is a large-scale benchmark for evaluating generative image models under broad distribution shifts.
Unlike benchmarks centered mostly on photorealistic imagery, VisForm covers a much wider spectrum of visual content, including photography, painting, illustration, diagrams, scientific imagery, UI-like graphics, sensor-style images, and design elements.
It is designed for:
- cross-domain generative model evaluation
- image quality metric benchmarking
- metric–human alignment analysis
- quality, aesthetics, and safety assessment
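Metric–human alignment is commonly summarized with a rank correlation between an automatic metric and the expert ratings. The following is a minimal sketch of that idea using Spearman's rank correlation in plain Python. The score lists are made-up illustration data, not values from VisForm, and the choice of Spearman correlation is an assumption rather than the protocol used in the paper.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of items tied with the item at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data: one automatic metric score and one mean expert rating per image.
metric_scores = [0.91, 0.40, 0.75, 0.62, 0.15]
expert_ratings = [4.7, 2.3, 3.3, 4.0, 1.0]

rho = spearman(metric_scores, expert_ratings)
print(f"Spearman rho = {rho:.3f}")
```

A value near 1 means the metric orders images almost the same way the experts do. Computing it separately per visual form would show where a metric's agreement with human judgment degrades.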
---
## Highlights
- **210,000 images**
- **62 visual forms**
- **12 representative generative models**
- **14 perceptual dimensions**
- **At least 3 expert annotators per image**
---
## What makes VisForm useful?
VisForm is built for settings where many existing evaluation benchmarks and metrics become less reliable, especially on:
- artistic imagery
- symbolic or structured graphics
- text-heavy layouts
- scientific and medical visualizations
- functional images such as depth maps and other sensor outputs
By explicitly covering these diverse forms, VisForm provides a stronger testbed for evaluating robustness beyond natural photos.
---
## Dataset Content
Each sample is associated with structured annotations such as:
- **visual form**
- **source model**
- **fine-grained artifact labels**
- **5-point expert ratings**
The benchmark focuses on three major aspects:
### Quality
Measures whether generated content is complete, legible, clear, and physically plausible.
### Aesthetics
Measures visual appeal, composition, color harmony, and stylistic coherence.
### Safety
Captures safety-related properties including harmful content, risky behavior, discrimination, intellectual property concerns, and the obviousness of generative artifacts.
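To make the annotation structure concrete, here is a small sketch of how one sample and its expert ratings might be represented and averaged. Every field name, the example values, and the assumption that the 5-point scale runs from 1 to 5 are hypothetical. The card does not specify the actual schema, and the three aspect keys are a simplification of the 14 perceptual dimensions.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class Sample:
    # All field names are hypothetical, not the dataset's actual schema.
    image_id: str
    visual_form: str
    source_model: str
    artifact_labels: list[str] = field(default_factory=list)
    # Maps an aspect name to one rating per expert annotator.
    ratings: dict[str, list[int]] = field(default_factory=dict)


def mean_ratings(sample, min_annotators=3):
    """Average each aspect's ratings after checking annotator count and range."""
    result = {}
    for aspect, scores in sample.ratings.items():
        if len(scores) < min_annotators:
            raise ValueError(f"{aspect}: expected at least {min_annotators} ratings")
        if any(not 1 <= s <= 5 for s in scores):
            raise ValueError(f"{aspect}: ratings must lie on the assumed 1-5 scale")
        result[aspect] = mean(scores)
    return result


# Made-up example record for illustration only.
sample = Sample(
    image_id="example-0001",
    visual_form="Chinese ink painting",
    source_model="model-a",
    artifact_labels=["blurred strokes"],
    ratings={"quality": [4, 5, 4], "aesthetics": [3, 4, 5], "safety": [5, 5, 5]},
)

scores = mean_ratings(sample)
print(scores)
```

Averaging is only one way to combine annotators. A median or a majority vote may be more appropriate for some dimensions.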
---
## Visual Forms
VisForm spans **14 high-level categories**, including:
- General Photography
- Specialized Photography
- Traditional Painting
- Creative and Conceptual Art
- Illustration and Comics
- Crafts
- Sculpture and Objects
- Digital Graphics
- Scientific Imaging
- Diagrams
- Data Visualization
- Sensor Data
- Patterns
- Design Elements
Representative examples include **realistic photos, sketches, film posters, paper cutting, Chinese ink painting, CT images, infographics, charts, depth maps, textures, and collages**.
---
## Use Cases
VisForm is intended for:
- benchmarking generative image models
- evaluating automatic image quality metrics
- studying robustness under domain shift
- analyzing expert judgments of generated images
- comparing model families across visual forms
- developing new evaluation metrics for quality, aesthetics, and safety
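As one illustration of comparing models across visual forms, the sketch below groups mean expert ratings by source model and visual form, then reports how much each model's rating drops when moving from one form to another. The rows, model names, and ratings are invented for the example and do not come from VisForm.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows: (source model, visual form, mean expert quality rating).
rows = [
    ("model-a", "realistic photos", 4.5),
    ("model-a", "realistic photos", 4.0),
    ("model-a", "depth maps", 2.0),
    ("model-b", "realistic photos", 4.0),
    ("model-b", "depth maps", 3.5),
    ("model-b", "depth maps", 3.0),
]


def mean_by_model_and_form(rows):
    """Return {(model, form): mean rating} over all rows in that group."""
    buckets = defaultdict(list)
    for model, form, rating in rows:
        buckets[(model, form)].append(rating)
    return {key: mean(values) for key, values in buckets.items()}


def form_gap(table, model, reference_form, target_form):
    """Drop in mean rating when moving from reference_form to target_form."""
    return table[(model, reference_form)] - table[(model, target_form)]


table = mean_by_model_and_form(rows)
for (model, form), score in sorted(table.items()):
    print(f"{model:8s} {form:18s} {score:.2f}")

gap_a = form_gap(table, "model-a", "realistic photos", "depth maps")
gap_b = form_gap(table, "model-b", "realistic photos", "depth maps")
print(f"model-a gap: {gap_a:.2f}, model-b gap: {gap_b:.2f}")
```

A smaller gap suggests a model holds up better outside natural photos, although real comparisons would need enough samples per group to be meaningful.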
---
## Paper
**Evaluating Generative Models via One-Dimensional Code Distributions**
**Zexi Jia, Pengcheng Luo, Yijia Zhong, Jinchao Zhang, Jie Zhou**
**CVPR 2026**
[arXiv: 2603.08064](https://arxiv.org/abs/2603.08064)
---
## Citation
If you use **VisForm** in your research, please cite:
```bibtex
@article{jia2026evaluating,
title={Evaluating Generative Models via One-Dimensional Code Distributions},
author={Jia, Zexi and Luo, Pengcheng and Zhong, Yijia and Zhang, Jinchao and Zhou, Jie},
journal={arXiv preprint arXiv:2603.08064},
year={2026}
}
```
Provider: ZexiJia



