table-extraction
ModelScope community. Updated 2026-01-09. Indexed 2025-11-03.
Download link:
https://modelscope.cn/datasets/keremberke/table-extraction
Resource overview:
<div align="center">
<img width="640" alt="keremberke/table-extraction" src="https://huggingface.co/datasets/keremberke/table-extraction/resolve/main/thumbnail.jpg">
</div>
### Dataset Labels
```
['bordered', 'borderless']
```
### Number of Images
```json
{"test": 34, "train": 238, "valid": 70}
```
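As a quick sanity check on the split sizes above, the following minimal sketch sums them and computes each split's share of the whole. The variable names are illustrative only; the counts are the ones listed in this card.

```python
# Split sizes as listed in the dataset card.
split_sizes = {"test": 34, "train": 238, "valid": 70}

# Total number of images across all splits.
total = sum(split_sizes.values())

# Fraction of the whole dataset held by each split, rounded to three decimals.
split_fractions = {
    name: round(count / total, 3) for name, count in split_sizes.items()
}

print(total)
print(split_fractions)
```

The total comes to 342 images, and the training split holds roughly 70% of them.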
### How to Use
- Install [datasets](https://pypi.org/project/datasets/):
```bash
pip install datasets
```
- Load the dataset:
```python
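from datasets import load_dataset
# "full" is the configuration name given in this card.
ds = load_dataset("keremberke/table-extraction", name="full")
# Take the first record of the training split.
example = ds['train'][0]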
```
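Once records are loaded, a common first step is to count how many tables carry each label. The sketch below is hedged: it assumes each record is a dict with an `objects` entry whose `category` field is a list of integer indices into the label list, which this card does not confirm. The helper `count_labels` and the `sample` records are hypothetical and exist only to show the idea.

```python
from collections import Counter

# Label names from the dataset card, assumed to be in index order.
LABELS = ["bordered", "borderless"]


def count_labels(examples, labels=LABELS):
    """Count annotated tables per label.

    Assumes each example is a dict with an "objects" dict whose "category"
    entry is a list of integer label indices. This schema is an assumption,
    not something stated in the card.
    """
    counts = Counter()
    for example in examples:
        for category in example["objects"]["category"]:
            counts[labels[category]] += 1
    return dict(counts)


# Hand-written stand-ins for real records, used only to show the call.
sample = [
    {"objects": {"category": [0, 1]}},
    {"objects": {"category": [1]}},
]
print(count_labels(sample))
```

If the real records follow the assumed schema, the same helper could be called on `ds['train']`. Inspect `example.keys()` first to confirm the field names.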
### Roboflow Dataset Page
[https://universe.roboflow.com/mohamed-traore-2ekkp/table-extraction-pdf/dataset/2](https://universe.roboflow.com/mohamed-traore-2ekkp/table-extraction-pdf/dataset/2?ref=roboflow2huggingface)
### Citation
```
```
### License
CC BY 4.0
### Dataset Summary
This dataset was exported via roboflow.com on January 18, 2023 at 9:41 AM GMT.
Roboflow is an end-to-end computer vision platform that helps you:
* collaborate with your team on computer vision projects
* collect & organize images
* understand and search unstructured image data
* annotate images and create datasets
* export, train, and deploy computer vision models
* use active learning to improve your dataset over time
For state-of-the-art computer vision training notebooks that you can use with this dataset,
visit https://github.com/roboflow/notebooks
To find over 100k other datasets and pre-trained models, visit https://universe.roboflow.com
The dataset includes 342 images.
The tables are annotated in COCO format.
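COCO annotations store each bounding box as `[x_min, y_min, width, height]`, while many drawing and evaluation utilities expect corner coordinates instead. The sketch below shows that conversion. The helper name `coco_to_corners` and the sample values are illustrative and not taken from the dataset.

```python
def coco_to_corners(bbox):
    """Convert a COCO box [x_min, y_min, width, height]
    to corner form [x_min, y_min, x_max, y_max]."""
    x_min, y_min, width, height = bbox
    return [x_min, y_min, x_min + width, y_min + height]


# Illustrative values, not taken from the dataset.
print(coco_to_corners([10, 20, 100, 50]))
```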
The following pre-processing was applied to each image:
* Auto-orientation of pixel data (with EXIF-orientation stripping)
No image augmentation techniques were applied.
Provider:
maas
Created:
2025-10-04



