Synthetic English Restaurant Receipts Datapack
Source: Snowflake | Updated: 2024-09-11 | Indexed: 2024-09-12
Download link:
https://app.snowflake.com/marketplace/listing/GZ1MOZ7BX1O
Description:
This datapack is a collection of synthetic English-language restaurant receipts for training machine learning models on document understanding and processing tasks. The receipts vary in structure, itemization, UK addresses and logos, and total calculations, and all are rendered in a high-fidelity 3D environment. The documents simulate real-world conditions such as crumpling, smudging, and inconsistent lighting to make models more resilient. Each receipt carries detailed field-level annotations, including item names, prices, and totals, which supports precise model training. The dataset suits teams building automated receipt scanning and expense management systems.
This datapack includes three tables: ANNOTATION_VIEW, IMAGE_VIEW, and ZIP_VIEW.
**ANNOTATION_VIEW** contains information for each annotation field including the name of the field, the text within the field, 4 corner coordinates of the field in clockwise order, and the name of the image this annotation belongs to.
**IMAGE_VIEW** contains information for each image including its name, its size, its URL, and the coordinates of the document corners in the image.
**ZIP_VIEW** contains the URL for downloading the zip file that holds all images and annotations in the Mindtech, ICDAR2015, and Wildreceipt formats.
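As a rough illustration of how rows from ANNOTATION_VIEW might be handled once loaded, the sketch below models one annotation, converts its four clockwise corners to an axis-aligned bounding box, groups annotations by image name, and serialises one row as a comma-separated text line. The attribute names (`image_name`, `field_name`, `text`, `corners`) are hypothetical, since the listing does not give the actual column names. The `x1,y1,...,x4,y4,transcription` line layout is an assumption about the ICDAR2015-style export and should be checked against the files in the zip.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple

Point = Tuple[float, float]


@dataclass(frozen=True)
class Annotation:
    """One ANNOTATION_VIEW row. Attribute names are hypothetical."""
    image_name: str   # name of the image this annotation belongs to
    field_name: str   # name of the annotated field, e.g. "total"
    text: str         # text within the field
    corners: Tuple[Point, Point, Point, Point]  # 4 corners, clockwise order


def quad_to_bbox(corners: Iterable[Point]) -> Tuple[float, float, float, float]:
    """Return the axis-aligned box (x_min, y_min, x_max, y_max) enclosing the quad."""
    pts = list(corners)
    xs = [x for x, _ in pts]
    ys = [y for _, y in pts]
    return (min(xs), min(ys), max(xs), max(ys))


def group_by_image(annotations: Iterable[Annotation]) -> Dict[str, List[Annotation]]:
    """Collect annotations under the image name they reference."""
    grouped: Dict[str, List[Annotation]] = defaultdict(list)
    for ann in annotations:
        grouped[ann.image_name].append(ann)
    return dict(grouped)


def to_quad_text_line(ann: Annotation) -> str:
    """Serialise as 'x1,y1,x2,y2,x3,y3,x4,y4,text' (assumed ICDAR2015-style layout)."""
    coords = ",".join(str(int(round(v))) for point in ann.corners for v in point)
    return f"{coords},{ann.text}"


# Made-up sample rows, for illustration only.
rows = [
    Annotation("r1.png", "total", "12.50", ((10, 20), (110, 22), (108, 52), (8, 50))),
    Annotation("r1.png", "item_name", "Fish and chips", ((10, 60), (200, 60), (200, 90), (10, 90))),
    Annotation("r2.png", "total", "7.00", ((5, 5), (50, 5), (50, 25), (5, 25))),
]
by_image = group_by_image(rows)
```

Because receipts in the images can be crumpled or rotated, the quadrilateral is the more faithful representation; the axis-aligned box is only a convenience for tools that expect rectangles.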
Please contact Mindtech for the full datapack.
Provider:
Mindtech Global
Created:
2024-08-30

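Background and challenges
Background overview
This dataset is a collection of synthetic English restaurant receipts designed for training machine learning models for document understanding and processing. It contains receipts generated in a high-fidelity 3D environment that simulate real-world conditions such as crumpling and smudging to improve model robustness. Each receipt comes with detailed field-level annotations, such as item names and prices, making it suitable for developing automated receipt scanning and expense management systems. The dataset comprises three tables, ANNOTATION_VIEW, IMAGE_VIEW, and ZIP_VIEW, which store annotation information, image data, and download links, respectively.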
The summary above was collected and generated by 遇见数据集.



