
Datasets for "Reading Order Independent Metrics for Information Extraction in Handwritten Documents"

Source: NIAID Data Ecosystem (indexed 2026-05-01)
Download link:
https://zenodo.org/record/11083656
Description:
This repository includes the five datasets used for our paper entitled "Reading Order Independent Metrics for Information Extraction in Handwritten Documents", in which we compare various metrics to evaluate end-to-end information extraction from scanned documents.

Datasets

Five datasets are released following the BIO format:
- IAM
- Simara
- POPP
- Esposalles
- French Military Records

For each dataset, we provide the following data (on test sets):
- Ground truth annotations (gt/)
- Automatic predictions (dan/)
- Automatic predictions with entities appearing in random order (dan_shuffled/)

The data is organized as follows:

    ├── Dataset name/
    │   ├── gt/
    │   ├── dan/
    │   └── dan_shuffled/

Metrics

To install the ie-eval package, run:

    pip install ie-eval

To compute all metrics on a specific dataset, run:

    ie-eval all --label-dir IAM_paragraph/gt/ --prediction-dir IAM_paragraph/dan/

To learn more about the various options, use the --help argument or read the documentation.
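To illustrate why the shuffled predictions matter, here is a minimal Python sketch of a reading-order-independent comparison. It is not the ie-eval implementation. It assumes each BIO file holds one "token TAG" pair per line (for example "Paris B-LOC"), which may differ from the actual file layout, and the helper names parse_bio and bag_of_entities_f1 are hypothetical.

```python
from collections import Counter


def parse_bio(lines):
    """Group BIO-tagged tokens into (label, text) entities.

    Assumes one "token TAG" pair per line; this layout is an assumption,
    not something confirmed by the dataset description.
    """
    entities = []
    label, tokens = None, []

    def flush():
        nonlocal label, tokens
        if label is not None:
            entities.append((label, " ".join(tokens)))
        label, tokens = None, []

    for line in lines:
        parts = line.strip().rsplit(None, 1)
        if len(parts) != 2:  # blank or malformed line closes any open entity
            flush()
            continue
        token, tag = parts
        prefix, _, tag_label = tag.partition("-")
        if prefix == "B" or (prefix == "I" and tag_label != label):
            flush()  # a new entity starts here
            label, tokens = tag_label, [token]
        elif prefix == "I":
            tokens.append(token)  # continuation of the open entity
        else:
            flush()  # "O" or unknown tag: outside any entity
    flush()
    return entities


def bag_of_entities_f1(gt_entities, pred_entities):
    """F1 over multisets of entities, ignoring the order they appear in."""
    gt_bag, pred_bag = Counter(gt_entities), Counter(pred_entities)
    true_pos = sum((gt_bag & pred_bag).values())
    precision = true_pos / sum(pred_bag.values()) if pred_bag else 0.0
    recall = true_pos / sum(gt_bag.values()) if gt_bag else 0.0
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# Toy example (made-up tokens, not taken from the released datasets).
gt_lines = ["Georg B-PER", "Smith I-PER", "lives O", "in O", "Paris B-LOC"]
shuffled_lines = ["Paris B-LOC", "Georg B-PER", "Smith I-PER"]
score = bag_of_entities_f1(parse_bio(gt_lines), parse_bio(shuffled_lines))
print(score)
```

Because entities are compared as a multiset, a prediction that lists the same entities in a different order scores the same as one in the original order, which is the property the dan_shuffled/ predictions are meant to probe.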
Created:
2024-04-29