deepcopy/UniMER

Name: deepcopy/UniMER
Creator: deepcopy
Published: 2025-06-18 09:40:31
License: Not specified

Source: Hugging Face (updated 2025-06-18; indexed 2025-10-25)

Download link:

https://hf-mirror.com/datasets/deepcopy/UniMER

Description:

UniMER is a dataset released for general-purpose Mathematical Expression Recognition (MER). It comprises the UniMER-1M training set, with over one million instances covering a broad and complex range of mathematical expressions, and the UniMER Test Set, designed for benchmarking MER models in real-world scenarios. Details of the dataset are as follows:

- UniMER-1M Training Set: 1,061,791 samples in total, a balanced mix of concise expressions and complex, extended formulas, intended for training robust, high-accuracy MER models with better recognition precision and generalization.
- UniMER Test Set: 23,757 samples in total, divided into four types: Simple Printed Expressions (SPE), Complex Printed Expressions (CPE), Screen Capture Expressions (SCE), and Handwritten Expressions (HWE), to evaluate formula recognition across a range of real-world conditions.
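Because the test set is split into four expression types (SPE, CPE, SCE, HWE), results are most informative when reported per type rather than as one pooled number. The following is a minimal sketch of that grouping, not the dataset authors' evaluation code. The choice of metric (character-level normalized edit distance), the function names, and the toy records are all assumptions made for illustration; the listing does not say which metric the benchmark uses.

```python
from collections import defaultdict


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum inserts, deletes, and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def normalized_edit_distance(pred: str, ref: str) -> float:
    """Edit distance divided by the longer string's length (0.0 = identical)."""
    longest = max(len(pred), len(ref))
    return edit_distance(pred, ref) / longest if longest else 0.0


def score_by_subset(records):
    """Average the metric per subset.

    records: iterable of (subset, predicted_latex, reference_latex) tuples.
    """
    per_subset = defaultdict(list)
    for subset, pred, ref in records:
        per_subset[subset].append(normalized_edit_distance(pred, ref))
    return {name: sum(vals) / len(vals) for name, vals in per_subset.items()}


# Hypothetical toy records, one per test subset; not taken from UniMER.
records = [
    ("SPE", "x^2", "x^2"),
    ("CPE", r"\frac{a}{b}", r"\frac{a}{c}"),
    ("SCE", "a+b", "a-b"),
    ("HWE", "y=mx", "y=mx+b"),
]
scores = score_by_subset(records)
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

Keeping the subsets separate makes it easy to see, for example, whether a model that does well on printed expressions degrades on handwritten ones.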

Provider:

deepcopy
