FelipeBandeiraPoatek/evaluation
Source: Hugging Face · Updated: 2023-07-21 · Indexed: 2024-03-04
Download link:
https://hf-mirror.com/datasets/FelipeBandeiraPoatek/evaluation
Resource description:
---
dataset_info:
  features:
  - name: image
    dtype: image
  - name: ground_truth
    dtype: string
  splits:
  - name: train
    num_bytes: 234024421
    num_examples: 425
  - name: test
    num_bytes: 14512665
    num_examples: 26
  - name: validation
    num_bytes: 27661738
    num_examples: 50
  download_size: 197512750
  dataset_size: 276198824
license: mit
task_categories:
- feature-extraction
language:
- en
pretty_name: Sparrow Invoice Dataset
size_categories:
- n<1K
---
# Dataset Card for Invoices (Sparrow)
This dataset contains about 500 invoice documents (501 examples across the train, test, and validation splits listed in the metadata above), annotated and processed for fine-tuning the Donut model.
Annotation and data preparation were done by the [Katana ML](https://www.katanaml.io) team.
[Sparrow](https://github.com/katanaml/sparrow/tree/main) is an open-source data extraction solution by Katana ML.
Original dataset [info](https://data.mendeley.com/datasets/tnj49gpmtz): Kozłowski, Marek; Weichbroth, Paweł (2021), “Samples of electronic invoices”, Mendeley Data, V2, doi: 10.17632/tnj49gpmtz.2
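
As a minimal sketch of how the two features might be read, the helpers below load the splits with the Hugging Face `datasets` library and decode one `ground_truth` string. The function names `load_invoice_splits` and `decode_ground_truth` are hypothetical, and treating `ground_truth` as a JSON document is an assumption based on common Donut fine-tuning practice, not something this card states.

```python
import json


def load_invoice_splits(repo_id="FelipeBandeiraPoatek/evaluation"):
    """Download all splits from the Hugging Face Hub (needs network access)."""
    # Imported lazily so decode_ground_truth works without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(repo_id)


def decode_ground_truth(raw):
    """Decode a ground_truth string, ASSUMING it holds a JSON document."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        # The JSON assumption did not hold; hand back the raw string unchanged.
        return raw
```

With network access, `load_invoice_splits()["train"][0]` should return one record whose `image` field is a decoded image and whose `ground_truth` field can be passed to `decode_ground_truth`. The fallback branch keeps the helper usable even if the annotation format differs from the assumed one.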
Provider:
FelipeBandeiraPoatek
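Summary of original information
Dataset overview
Basic information
- Name: Sparrow Invoice Dataset
- License: MIT
- Language: English (en)
- Task category: feature extraction (feature-extraction)
- Size category: fewer than 1K (n<1K)
Dataset structure
- Features:
  - image: image type
  - ground_truth: string type
Data splits
- Training set:
  - Number of examples: 425
  - Storage size: 234024421 bytes
- Test set:
  - Number of examples: 26
  - Storage size: 14512665 bytes
- Validation set:
  - Number of examples: 50
  - Storage size: 27661738 bytes
Dataset size
- Download size: 197512750 bytes
- Total dataset size: 276198824 bytes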
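
The split figures can be cross-checked with a little arithmetic. The sketch below copies the example counts and byte sizes from the metadata above, sums them, and reports each split's share of the examples. The names `SPLITS` and `summarize` are illustrative only.

```python
# Split metadata copied from the dataset card metadata.
SPLITS = {
    "train": {"num_examples": 425, "num_bytes": 234024421},
    "test": {"num_examples": 26, "num_bytes": 14512665},
    "validation": {"num_examples": 50, "num_bytes": 27661738},
}


def summarize(splits):
    """Return total examples, total bytes, and each split's share of examples."""
    total_examples = sum(s["num_examples"] for s in splits.values())
    total_bytes = sum(s["num_bytes"] for s in splits.values())
    shares = {
        name: s["num_examples"] / total_examples for name, s in splits.items()
    }
    return total_examples, total_bytes, shares


total_examples, total_bytes, shares = summarize(SPLITS)
print(total_examples, total_bytes)
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```

The byte sizes sum to 276198824, which matches the reported `dataset_size`, and the example counts sum to 501, one more than the round figure of 500 quoted in the card text.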



