johaness14/IMG2LATEX
Source: Hugging Face. Updated: 2025-09-25. Indexed: 2025-10-25.
Download link:
https://hf-mirror.com/datasets/johaness14/IMG2LATEX
Description:
This dataset is a large-scale, cleaned, and unified collection of mathematical formula images paired with their LaTeX source code. It was built by combining and preprocessing several publicly available datasets. Its goal is to provide a robust foundation for training Optical Character Recognition (OCR) models on mathematical formulas, specifically for image-to-LaTeX tasks. All images are kept at their native resolution to avoid information loss from resizing or distortion. The dataset has two columns: image, which holds the formula image in grayscale at its original resolution, and formula, which holds the ground-truth LaTeX string.
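As a rough illustration of the two-column layout described above, here is a minimal Python sketch of how the records might be read. It assumes the third-party Hugging Face `datasets` library is installed, that the repository exposes a split named "train" (a guess, not stated in this listing), and that the image column decodes to an image object. The helper names `iter_pairs` and `load_rows` are hypothetical and not part of the dataset or any library.

```python
from typing import Any, Dict, Iterable, Iterator, Tuple

# Repository name as given in this listing.
DATASET_ID = "johaness14/IMG2LATEX"


def iter_pairs(rows: Iterable[Dict[str, Any]]) -> Iterator[Tuple[Any, str]]:
    """Yield (image, formula) pairs, skipping rows whose formula is empty.

    Column names "image" and "formula" come from the dataset description.
    """
    for row in rows:
        formula = (row.get("formula") or "").strip()
        if formula:
            yield row["image"], formula


def load_rows(split: str = "train"):
    """Stream rows from the Hub.

    Assumption: a split called "train" exists; adjust if it does not.
    """
    # Imported lazily so iter_pairs stays usable without the library.
    from datasets import load_dataset  # pip install datasets

    return load_dataset(DATASET_ID, split=split, streaming=True)


if __name__ == "__main__":
    # Print the first non-empty pair; the image is expected (not guaranteed)
    # to be a PIL image exposing a .size attribute.
    for image, formula in iter_pairs(load_rows()):
        print(getattr(image, "size", None), formula)
        break
```

Streaming is used here only so that a quick inspection does not download the whole collection. Because the images keep their native resolution, a training pipeline would likely need its own padding or batching strategy for variable-sized inputs.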
Provider:
johaness14



