CSU-JPG/TextEditBench

Source: Hugging Face. Updated: 2026-03-09. Indexed: 2026-01-03.
Download link:
https://hf-mirror.com/datasets/CSU-JPG/TextEditBench
Resource description:
TextEditBench is a comprehensive benchmark for evaluating reasoning-aware text editing beyond mere rendering. It focuses explicitly on text-centric regions across 14 topics and 6 task types, emphasizing reasoning-intensive scenarios that require models to understand physical plausibility, linguistic meaning, and cross-modal dependencies. The dataset comprises 1,196 high-quality instances, curated through a rigorous Human-AI-Human verification pipeline, combining manually produced instances (58%) with web-sourced instances (42%). It covers 14 diverse topics drawn from everyday visual scenes, including Professional Documents, Digital Interfaces, Signage, Menus, and Packaging, and 6 atomic operations (Delete, Insert, Change, Relocation, Scaling, and Attribute Transfer), each designed to test a specific capability. Each instance is scored from 0 to 20 based on 10 difficulty attributes and assigned to an Easy, Medium, or Hard tier, enabling fine-grained analysis of model robustness.
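
The scoring scheme described above (10 difficulty attributes, a 0-20 score, three tiers) can be illustrated with a minimal Python sketch. The description does not say how the attributes are rated or where the tier boundaries fall, so the per-attribute range (0 to 2) and the cutoffs EASY_MAX and MEDIUM_MAX below are hypothetical placeholders, not the benchmark's actual values. The function names are likewise illustrative.

```python
from typing import Sequence

NUM_ATTRIBUTES = 10  # from the description: 10 difficulty attributes
MAX_SCORE = 20       # from the description: scores range over 0-20

# ASSUMPTION: each attribute is rated 0, 1, or 2, so ten ratings sum to 0-20.
MAX_PER_ATTRIBUTE = MAX_SCORE // NUM_ATTRIBUTES

# ASSUMPTION: hypothetical tier cutoffs; the real thresholds are not stated.
EASY_MAX = 6
MEDIUM_MAX = 13


def difficulty_score(ratings: Sequence[int]) -> int:
    """Sum the per-attribute ratings into a single 0-20 difficulty score."""
    if len(ratings) != NUM_ATTRIBUTES:
        raise ValueError(f"expected {NUM_ATTRIBUTES} ratings, got {len(ratings)}")
    for rating in ratings:
        if not 0 <= rating <= MAX_PER_ATTRIBUTE:
            raise ValueError(f"rating {rating} outside 0-{MAX_PER_ATTRIBUTE}")
    return sum(ratings)


def difficulty_tier(score: int) -> str:
    """Map a 0-20 score to Easy, Medium, or Hard using the assumed cutoffs."""
    if not 0 <= score <= MAX_SCORE:
        raise ValueError(f"score {score} outside 0-{MAX_SCORE}")
    if score <= EASY_MAX:
        return "Easy"
    if score <= MEDIUM_MAX:
        return "Medium"
    return "Hard"
```

For example, under these assumptions an instance rated 1 on every attribute would score 10 and land in the Medium tier. Substituting the benchmark's real rating scale and thresholds, once known, only requires changing the constants.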
Provided by:
CSU-JPG