Paper2Fig100k
ModelScope Community. Updated 2025-08-18; indexed 2024-08-31.
Download link:
https://modelscope.cn/datasets/OmniData/Paper2Fig100k
Resource overview:
displayName: Paper2Fig100k
license:
- CC BY 4.0
paperUrl: https://arxiv.org//pdf/2210.11248.pdf
publishDate: "2022-11-07"
publishUrl: https://zenodo.org/record/7299423#.Y2xxEXZBy5e
publisher:
- Computer Vision Center
- ServiceNow Research
- École de technologie supérieure
tags:
- Paper graphics
- Paper text
---
# Dataset Introduction
## Overview
This dataset contains over 100k graphic images and text captions sourced from research papers. The graphic images include figures, diagrams, method schematics and architectural plots from research papers hosted on arXiv.org. We also provide text captions for each graphic, as well as OCR detection and recognition results for the graphics, including bounding boxes and recognized text.
The dataset structure consists of a directory named "data" and two JSON files (train and test) that store data for each graphic. Each JSON object includes the following information about the graphic:
- figure_id: the graphic identifier based on arXiv identifiers.
- caption: text pairs extracted from the paper associated with the graphic, such as the actual caption of the figure or a reference to this figure in the manuscript.
- ocr_result: the results of OCR text recognition performed on the image. We provide a list of triplets (bounding box, confidence score, recognized text) present in the image.
- aspect_ratio: the aspect ratio of the image (H/W).
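The record layout described above can be read with a few lines of Python. This is a minimal sketch, not the dataset authors' loader. It assumes that each JSON file holds a single array of objects, that the keys are spelled exactly as in the list above, and that every `ocr_result` entry is a three-element sequence in the order (bounding box, confidence, text). The helper names `iter_records`, `confident_ocr_text`, and `is_wide` are hypothetical.

```python
import json
from pathlib import Path
from typing import Any, Iterator


def iter_records(json_path: Path) -> Iterator[dict[str, Any]]:
    """Yield one dict per figure.

    Assumption: the file contains one JSON array of figure objects.
    """
    with json_path.open(encoding="utf-8") as fh:
        records = json.load(fh)
    yield from records


def confident_ocr_text(record: dict[str, Any], min_confidence: float = 0.5) -> list[str]:
    """Return OCR strings whose confidence is at least min_confidence.

    Assumption: each ocr_result entry is a (bounding_box, confidence, text) triplet.
    """
    return [
        text
        for _bbox, confidence, text in record.get("ocr_result", [])
        if confidence >= min_confidence
    ]


def is_wide(record: dict[str, Any]) -> bool:
    """aspect_ratio is H/W, so a value below 1 means the image is wider than it is tall."""
    return record["aspect_ratio"] < 1.0
```

A call such as `for rec in iter_records(Path("train.json"))` would then walk the training split; the file name `train.json` is itself an assumption and should be checked against the downloaded files.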
## Download Dataset
:modelscope-code[]{type="git"}
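Outside the ModelScope page, where the directive above is not rendered, the repository could be fetched with a plain `git clone`. The sketch below only illustrates that idea. The `.git` URL is an assumption derived from the dataset page address and should be verified on ModelScope; large files may also require Git LFS. The function names `build_clone_command` and `clone_dataset` are hypothetical.

```python
import subprocess
from pathlib import Path

# Assumption: the dataset repository is reachable over git at this address,
# inferred from the dataset page URL. Verify it on ModelScope before use.
REPO_URL = "https://www.modelscope.cn/datasets/OmniData/Paper2Fig100k.git"


def build_clone_command(repo_url: str, dest: Path) -> list[str]:
    """Build the argument list for `git clone <repo_url> <dest>`."""
    return ["git", "clone", repo_url, str(dest)]


def clone_dataset(dest: Path, repo_url: str = REPO_URL) -> None:
    """Run git clone; raises CalledProcessError if git exits with an error."""
    subprocess.run(build_clone_command(repo_url, dest), check=True)
```

Calling `clone_dataset(Path("Paper2Fig100k"))` would place the repository in a local `Paper2Fig100k` directory, provided `git` is installed and the assumed URL is correct.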
Provider:
maas
Created:
2024-07-15



