docling-project/SynthFormulaNet
Source: Hugging Face. Updated 2025-07-31; indexed 2025-11-15.
Download link:
https://hf-mirror.com/datasets/docling-project/SynthFormulaNet
Description:
SynthFormulaNet is a multimodal dataset containing over 6.4 million pairs of synthetically rendered images of mathematical formulas and their corresponding LaTeX representations, designed for training multimodal models for document understanding, especially for formula snippet extraction and transcription to LaTeX.
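As an illustration of how image/LaTeX pairs like these might be consumed, here is a minimal Python sketch. It assumes the repository loads with the Hugging Face `datasets` library under its default configuration, that a split named `train` exists, and that each record has fields named `image` and `latex`. None of this is stated in the listing above, and the helpers `stream_records` and `iter_pairs` are hypothetical, so check the dataset card before relying on any of these names.

```python
from typing import Any, Dict, Iterable, Iterator, Tuple

REPO_ID = "docling-project/SynthFormulaNet"  # repository name from the listing


def stream_records(split: str = "train") -> Iterable[Dict[str, Any]]:
    """Stream records from the Hub instead of downloading all 6.4M pairs.

    Assumption: the default configuration loads and a split with this
    name exists; the listing does not confirm either.
    """
    from datasets import load_dataset  # requires: pip install datasets

    return load_dataset(REPO_ID, split=split, streaming=True)


def iter_pairs(
    records: Iterable[Dict[str, Any]],
    image_key: str = "image",   # assumed column name
    latex_key: str = "latex",   # assumed column name
) -> Iterator[Tuple[Any, str]]:
    """Yield (image, latex) pairs, skipping records with a missing image
    or an empty / non-string LaTeX field."""
    for record in records:
        image = record.get(image_key)
        latex = record.get(latex_key)
        if image is None or not isinstance(latex, str) or not latex.strip():
            continue
        yield image, latex.strip()
```

A typical use would be `for image, latex in iter_pairs(stream_records()): ...`, adjusting `image_key` and `latex_key` to whatever column names the dataset actually exposes. Streaming is chosen here only because of the dataset's size; a full download works the same way without `streaming=True`.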
Provided by:
docling-project



