omron-sinicx/scipostlayout_v2
Source: Hugging Face. Updated 2024-07-31, indexed 2025-04-12.
Download link:
https://hf-mirror.com/datasets/omron-sinicx/scipostlayout_v2
Description:
---
license: cc-by-4.0
language:
- en
size_categories:
- 1K<n<10K
---
This is the dataset repository for the paper "SciPostLayout: A Dataset for Layout Analysis and Layout Generation of Scientific Posters".
The dataset includes 6,855 poster images in the train set and 500 poster images in each of the dev and test sets.
It also includes 100 papers paired with 100 posters in the dev/test sets.
The file structure is as follows:
- scipostlayout
  - poster
    - png
      - train.zip
        - 6,855 poster images of the train set.
      - dev
        - 500 poster images of the dev set.
      - test
        - 500 poster images of the test set.
      - train.json
        - annotation data of the train set.
      - dev.json
        - annotation data of the dev set.
      - test.json
        - annotation data of the test set.
      - load_dataset.py
        - example script to load the dataset.
    - pdf
      - poster PDF files (optional).
  - paper
    - pdf
      - paper PDF files.
    - jpg
      - paper JPG files.
    - mmd
      - mmd files of papers parsed with Nougat.
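
As a rough illustration of how the layout above might be read, the following minimal sketch loads one split's annotation file and collects its poster images. It is not the repository's own `load_dataset.py`. The function name `load_split` is hypothetical, and the sketch assumes that `<split>.json` sits next to a `<split>` directory or a `<split>.zip` archive of PNG files. The annotation schema is not described in this card, so the parsed JSON is returned unchanged.

```python
import json
import zipfile
from pathlib import Path


def load_split(png_root, split):
    """Return (annotations, image_paths) for one split.

    Hypothetical helper. Assumes <png_root>/<split>.json exists next to
    either a <split>/ directory or a <split>.zip archive of PNG images.
    """
    png_root = Path(png_root)

    # The annotation schema is not documented in this card, so the parsed
    # JSON is returned as-is rather than interpreted.
    with open(png_root / f"{split}.json", encoding="utf-8") as f:
        annotations = json.load(f)

    image_dir = png_root / split
    archive = png_root / f"{split}.zip"
    if not image_dir.is_dir() and archive.is_file():
        # Train images are listed as train.zip; extract them once.
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(image_dir)

    # Search recursively in case the archive contains its own top-level folder.
    image_paths = sorted(image_dir.rglob("*.png")) if image_dir.is_dir() else []
    return annotations, image_paths


# Example call (the path is an assumption based on the listing above):
# annotations, images = load_split("scipostlayout/poster/png", "dev")
```

Matching each image to its annotation entry depends on keys inside the JSON files, which should be checked against the actual data or the provided `load_dataset.py` before relying on them.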
Provider:
omron-sinicx



