CORD
Indexed from arXiv on 2025-09-30
Download link:
https://github.com/clovaai/cord
Description:
CORD is a consolidated receipt dataset for post-OCR parsing, containing 1,000 receipt samples annotated for key information extraction. The annotations use 30 labels grouped into 4 categories, for example "Total" and "Subtotal". The data is split into 800 training samples, 100 validation samples, and 100 test samples.
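As a quick sanity check on the split sizes stated above, here is a minimal Python sketch. The names `SPLIT_SIZES` and `split_fractions` are illustrative assumptions and are not part of the CORD repository; the code only restates the numbers from the description and does not load any data.

```python
# Split sizes exactly as stated in the dataset description.
# SPLIT_SIZES and split_fractions are hypothetical names used for illustration.
SPLIT_SIZES = {"train": 800, "validation": 100, "test": 100}


def split_fractions(sizes):
    """Return each split's share of the total sample count."""
    total = sum(sizes.values())
    return {name: count / total for name, count in sizes.items()}


total = sum(SPLIT_SIZES.values())
fractions = split_fractions(SPLIT_SIZES)

print(total)      # total number of receipt samples
print(fractions)  # share of samples per split
```

The three splits add up to the 1,000 samples mentioned in the description, which corresponds to an 80/10/10 partition.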



