Paper2Fig100k

ModelScope community, updated 2025-08-18, indexed 2024-08-31
Download link:
https://modelscope.cn/datasets/OmniData/Paper2Fig100k
Resource overview:
---
displayName: Paper2Fig100k
license:
- CC BY 4.0
paperUrl: https://arxiv.org//pdf/2210.11248.pdf
publishDate: "2022-11-07"
publishUrl: https://zenodo.org/record/7299423#.Y2xxEXZBy5e
publisher:
- Computer Vision Center
- ServiceNow Research
- École de technologie supérieure
tags:
- Paper graphics
- Paper text
---

# Dataset Introduction

## Overview

Paper2Fig100k is a dataset of more than 100k figure images and text captions drawn from research papers. The figures show diagrams, methods, and architectures from papers hosted on arXiv.org. Each figure comes with its text captions and with OCR detection and recognition results (bounding boxes and text).

The dataset consists of a directory named "figures" and two JSON files (train and test) that hold the data for each figure. Each JSON object contains the following information about a figure:

- figure_id: the figure identifier, based on the arXiv identifier.
- captions: text pairs extracted from the paper that relate to the figure, for example the figure's actual caption or a reference to the figure in the manuscript.
- ocr_result: the result of running OCR text recognition on the image, given as a list of triplets (bounding box, confidence, text) found in the image.
- aspect: the aspect ratio of the image (H/W).

## Download Dataset

:modelscope-code[]{type="git"}
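
Once the files are downloaded, the per-figure records described above can be read with a few lines of Python. The following is a minimal sketch, not the publishers' loader. It assumes that each split file is a single JSON array of objects and that every `ocr_result` entry is stored as a three-element sequence (bounding box, confidence, text); neither layout detail is confirmed by the description. The function names and the 0.5 confidence threshold are hypothetical, chosen only for illustration.

```python
import json
from typing import Any, Dict, List


def load_records(json_path: str) -> List[Dict[str, Any]]:
    """Read one split file and return its list of per-figure records.

    Assumption: the file holds a single JSON array of objects.
    """
    with open(json_path, encoding="utf-8") as f:
        records = json.load(f)
    if not isinstance(records, list):
        raise ValueError("expected a JSON array of figure records")
    return records


def confident_ocr_text(record: Dict[str, Any], min_confidence: float = 0.5) -> List[str]:
    """Return the OCR strings whose confidence is at least min_confidence.

    Assumption: each ocr_result entry is (bounding box, confidence, text).
    The bounding box is treated as opaque because its format is not described.
    """
    texts = []
    for _bbox, confidence, text in record.get("ocr_result", []):
        if confidence >= min_confidence:
            texts.append(text)
    return texts
```

A typical use would call `load_records` on the train or test file and then apply `confident_ocr_text` to each record, for example to drop low-confidence detections before pairing the OCR text with a caption. If the real files nest the OCR triplets differently, only the unpacking line in the loop needs to change.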
Provider:
maas
Created:
2024-07-15