Datasets for "Reading Order Independent Metrics for Information Extraction in Handwritten Documents"
Source: NIAID Data Ecosystem (indexed 2026-05-01)
Download link:
https://zenodo.org/record/11083656
Description:
This repository contains the five datasets used in our paper "Reading Order Independent Metrics for Information Extraction in Handwritten Documents", in which we compare several metrics for evaluating end-to-end information extraction from scanned documents.
Datasets
Five datasets are released following the BIO format:
IAM
Simara
POPP
Esposalles
French Military Records
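In BIO tagging, each token is marked as the beginning of an entity (B-), inside an entity (I-), or outside any entity (O). The following is a minimal sketch of grouping tagged tokens into entities. It assumes one whitespace-separated "token tag" pair per line, which may not match the exact file layout of this release; the function name parse_bio, the sample text, and the labels are invented for illustration.

```python
from typing import List, Optional, Tuple


def parse_bio(lines: List[str]) -> List[Tuple[str, str]]:
    """Group BIO-tagged tokens into (label, text) entities.

    Assumption: each line is "token tag", separated by whitespace.
    """
    entities: List[Tuple[str, str]] = []
    label: Optional[str] = None
    tokens: List[str] = []

    def flush() -> None:
        # Close the entity currently being built, if any.
        nonlocal label, tokens
        if label is not None:
            entities.append((label, " ".join(tokens)))
        label, tokens = None, []

    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            # Blank or malformed line: treat it as an entity boundary.
            flush()
            continue
        token, tag = parts[0], parts[-1]
        if tag.startswith("B-"):
            flush()
            label, tokens = tag[2:], [token]
        elif tag.startswith("I-") and label == tag[2:]:
            tokens.append(token)
        else:
            # "O" tag, or an I- tag that does not continue the open entity.
            flush()
    flush()
    return entities


# Hypothetical sample, not taken from any of the released datasets.
sample = [
    "Jean B-name",
    "Dupont I-name",
    "born O",
    "in O",
    "Paris B-place",
]
print(parse_bio(sample))  # [('name', 'Jean Dupont'), ('place', 'Paris')]
```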
For each dataset, we provide the following data (on test sets):
Ground truth annotations (gt/)
Automatic predictions (dan/)
Automatic predictions with entities appearing in random order (dan_shuffled/)
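To illustrate why predictions with shuffled entities are useful, here is a minimal sketch of a score that ignores entity order: an F1 computed over the multiset of (label, text) pairs. This is an illustrative stand-in, not taken from the paper or from ie-eval; the function name bag_f1 and the sample entities are hypothetical.

```python
import random
from collections import Counter
from typing import List, Tuple

Entity = Tuple[str, str]  # (label, text)


def bag_f1(reference: List[Entity], predicted: List[Entity]) -> float:
    """F1 over the multiset of entities, ignoring the order they appear in."""
    ref, pred = Counter(reference), Counter(predicted)
    # Multiset intersection: each entity counts at most min(ref, pred) times.
    matched = sum((ref & pred).values())
    if matched == 0:
        return 0.0
    precision = matched / sum(pred.values())
    recall = matched / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


# Hypothetical entities, not taken from the released datasets.
reference = [("name", "Jean Dupont"), ("place", "Paris"), ("date", "1914")]
predicted = [("name", "Jean Dupont"), ("place", "Paris"), ("date", "1915")]

# Same predictions in a different order, mimicking the shuffled variant.
shuffled = predicted[:]
random.Random(0).shuffle(shuffled)

# Both scores are identical because order plays no role in the computation.
print(bag_f1(reference, predicted), bag_f1(reference, shuffled))
```

A metric that depends on reading order would generally give different scores for the two prediction lists, which is the contrast that the shuffled predictions make it possible to measure.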
The data is organized as follows:
├── Dataset name/
│   ├── gt/
│   ├── dan/
│   └── dan_shuffled/
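After downloading, the layout above can be checked with a short script. This is a sketch using only the three folder names listed in this description; the helper name missing_subdirs is hypothetical, and the example path should be replaced with the actual dataset directory.

```python
from pathlib import Path
from typing import List

# Subdirectories each dataset is described as containing.
EXPECTED_SUBDIRS = ("gt", "dan", "dan_shuffled")


def missing_subdirs(dataset_dir: str) -> List[str]:
    """Return the expected subdirectories that are absent from dataset_dir."""
    root = Path(dataset_dir)
    return [name for name in EXPECTED_SUBDIRS if not (root / name).is_dir()]


# Example path taken from the command in this description; adjust as needed.
for dataset in ["IAM_paragraph"]:
    missing = missing_subdirs(dataset)
    status = "OK" if not missing else "missing: " + ", ".join(missing)
    print(f"{dataset}: {status}")
```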
Metrics
To install the ie-eval package, run pip install ie-eval.
To compute all metrics on a specific dataset, run:
ie-eval all --label-dir IAM_paragraph/gt/ --prediction-dir IAM_paragraph/dan/
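The same command can be repeated for both prediction folders of a dataset. The sketch below only assembles the command shown above, with the ie-eval all subcommand and the --label-dir and --prediction-dir options; the helper names build_command and evaluate are hypothetical, and ie-eval must already be installed for evaluate to work.

```python
import shlex
import subprocess
from pathlib import Path
from typing import List


def build_command(dataset_dir: str, predictions: str = "dan") -> List[str]:
    """Assemble the ie-eval call for one dataset and one prediction folder."""
    root = Path(dataset_dir)
    return [
        "ie-eval", "all",
        "--label-dir", (root / "gt").as_posix() + "/",
        "--prediction-dir", (root / predictions).as_posix() + "/",
    ]


def evaluate(dataset_dir: str, predictions: str = "dan") -> int:
    """Run ie-eval and return its exit code (requires ie-eval on the PATH)."""
    return subprocess.run(build_command(dataset_dir, predictions), check=False).returncode


# Print the commands for the regular and the shuffled predictions.
for variant in ("dan", "dan_shuffled"):
    print(shlex.join(build_command("IAM_paragraph", variant)))
```

Calling evaluate("IAM_paragraph", "dan_shuffled") would then run the evaluation on the shuffled predictions, so the two sets of scores can be compared.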
To learn more about the various options, use the --help argument or read the documentation.
Created:
2024-04-29



