Datasets for "Reading Order Independent Metrics for Information Extraction in Handwritten Documents"

NIAID Data Ecosystem, indexed 2026-05-01

Download link:

https://zenodo.org/record/11083656

Description:

This repository includes the five datasets used for our paper entitled "Reading Order Independent Metrics for Information Extraction in Handwritten Documents", in which we compare various metrics to evaluate end-to-end information extraction from scanned documents.

Datasets

Five datasets are released following the BIO format:

- IAM
- Simara
- POPP
- Esposalles
- French Military Records

For each dataset, we provide the following data (on test sets):

- Ground truth annotations (gt/)
- Automatic predictions (dan/)
- Automatic predictions with entities appearing in random order (dan_shuffled/)

The data is organized as follows:

├── Dataset name/
│   ├── gt/
│   ├── dan/
│   └── dan_shuffled/

Metrics

To install the ie-eval package, run:

    pip install ie-eval

To compute all metrics on a specific dataset, run:

    ie-eval all --label-dir IAM_paragraph/gt/ --prediction-dir IAM_paragraph/dan/

To learn more about the various options, use the --help argument or read the documentation.
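The command above evaluates one prediction folder at a time. As a minimal sketch, and not part of the released package, the Python helper below builds the same `ie-eval all` invocation for both prediction folders of one dataset (dan/ and dan_shuffled/), so the ordered and shuffled results can be compared. The function names `build_commands` and `run_all` are hypothetical. The only flags used are the ones quoted in the description, and the layout is assumed to be exactly gt/, dan/ and dan_shuffled/ under the dataset folder.

```python
import shutil
import subprocess
from pathlib import Path

# Prediction folders named in the dataset description.
PREDICTION_DIRS = ("dan", "dan_shuffled")


def build_commands(dataset_dir):
    """Return one `ie-eval all` argument list per prediction folder.

    Hypothetical helper: it only assembles the command quoted in the
    description, assuming gt/, dan/ and dan_shuffled/ sit under dataset_dir.
    """
    root = Path(dataset_dir)
    return [
        [
            "ie-eval", "all",
            "--label-dir", f"{(root / 'gt').as_posix()}/",
            "--prediction-dir", f"{(root / pred).as_posix()}/",
        ]
        for pred in PREDICTION_DIRS
    ]


def run_all(dataset_dir):
    """Run every command; requires `pip install ie-eval` beforehand."""
    if shutil.which("ie-eval") is None:
        raise RuntimeError("ie-eval not found; install it with `pip install ie-eval`")
    for cmd in build_commands(dataset_dir):
        print("$", " ".join(cmd))
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Print the commands without executing them.
    for cmd in build_commands("IAM_paragraph"):
        print(" ".join(cmd))
```

Calling `run_all("IAM_paragraph")` from the directory that contains the extracted dataset would execute both evaluations in turn. The other dataset folder names are not given in the description, so they are left out here.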

Created:

2024-04-29
