
MathBridge

arXiv, indexed 2025-09-30
Download link:
https://huggingface.co/datasets/kyudan/mathbridge
Overview:
MathBridge contains approximately 23 million pairs of LaTeX formulas and their corresponding spoken mathematical expressions, with the goal of improving the readability of mathematical expressions. The dataset also includes the context sentences surrounding each formula, which help show how the formula is used in spoken language. Its task is to translate spoken mathematical expressions into LaTeX formulas.
Provided by:
arXiv and open-source textbook publishers
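
The record layout described in the overview (a spoken expression, its LaTeX formula, and the surrounding context sentences) can be illustrated with a minimal Python sketch. The field names and the sample record below are assumptions made up for illustration, not the dataset's confirmed schema; check the dataset card at the download link for the actual column names.

```python
from dataclasses import dataclass


@dataclass
class MathBridgeRecord:
    # Hypothetical field names; the real dataset columns may differ.
    context_before: str
    spoken: str
    latex: str
    context_after: str


def to_translation_pair(record: MathBridgeRecord) -> tuple[str, str]:
    """Return (source, target) for the spoken-to-LaTeX translation task."""
    return record.spoken.strip(), record.latex.strip()


def in_context(record: MathBridgeRecord) -> str:
    """Rebuild the sentence with the spoken expression in place of the formula."""
    parts = (record.context_before, record.spoken, record.context_after)
    return " ".join(p.strip() for p in parts if p.strip())


# Made-up example record, not taken from the dataset.
sample = MathBridgeRecord(
    context_before="The area of a circle of radius r is",
    spoken="pi times r squared",
    latex=r"\pi r^2",
    context_after="which grows quadratically with r.",
)

source, target = to_translation_pair(sample)
print(source, "->", target)
print(in_context(sample))
```

Keeping the context separate from the source/target pair lets a model be trained either on the bare expression or on the full sentence, depending on how much surrounding text is wanted.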