CSU-JPG/TextEditBench
Source: Hugging Face. Updated: 2026-03-09. Indexed: 2026-01-03.
Download link:
https://hf-mirror.com/datasets/CSU-JPG/TextEditBench
Description:
TextEditBench is a comprehensive benchmark for evaluating reasoning-aware text editing beyond mere rendering. It focuses on text-centric regions across 14 topics and 6 task types, emphasizing reasoning-intensive scenarios that require models to understand physical plausibility, linguistic meaning, and cross-modal dependencies. The dataset comprises 1,196 high-quality instances, curated through a rigorous Human-AI-Human verification pipeline, combining manually produced instances (58%) with web-sourced instances (42%). The 14 topics span everyday visual scenes, including Professional Documents, Digital Interfaces, Signage, Menus, and Packaging. The 6 atomic operations (Delete, Insert, Change, Relocation, Scaling, and Attribute Transfer) are designed to test specific capabilities. Each instance is scored from 0 to 20 based on 10 difficulty attributes and categorized into Easy, Medium, or Hard tiers, enabling fine-grained analysis of model robustness.
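The scoring scheme described above can be illustrated with a minimal sketch. The description states only that 10 difficulty attributes yield a score in the range 0-20 and that instances fall into three tiers. The per-attribute rating scale (0, 1, or 2) and the tier cutoffs (`easy_max`, `medium_max`) below are assumptions made for illustration, not values taken from the benchmark, and the function names are hypothetical.

```python
from typing import Sequence


def difficulty_score(attributes: Sequence[int]) -> int:
    """Sum 10 per-attribute ratings into a single 0-20 score.

    ASSUMPTION: each attribute is rated 0, 1, or 2. The source gives
    only the attribute count (10) and the score range (0-20).
    """
    if len(attributes) != 10:
        raise ValueError("expected exactly 10 difficulty attributes")
    if any(a not in (0, 1, 2) for a in attributes):
        raise ValueError("each attribute rating must be 0, 1, or 2")
    return sum(attributes)


def difficulty_tier(score: int, easy_max: int = 6, medium_max: int = 13) -> str:
    """Map a 0-20 score to a tier name.

    ASSUMPTION: the cutoffs are hypothetical placeholders; the source
    does not state where the Easy/Medium/Hard boundaries lie.
    """
    if not 0 <= score <= 20:
        raise ValueError("score must be in the range 0-20")
    if score <= easy_max:
        return "Easy"
    if score <= medium_max:
        return "Medium"
    return "Hard"


# Example: an instance rated 1 on every attribute.
ratings = [1] * 10
score = difficulty_score(ratings)
tier = difficulty_tier(score)
```

Keeping the cutoffs as parameters makes it easy to substitute the real tier boundaries once they are known from the benchmark's own documentation.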
Provider:
CSU-JPG



