Handwritten synthetic dataset from the IAM

Name: Handwritten synthetic dataset from the IAM
Creator: RMIT University
Published: 2025-04-01 06:44:28
License: No description available

DataCite Commons | Updated 2025-04-01 | Indexed 2024-07-13

Download link:

https://rmit.figshare.com/articles/dataset/Handwritten_synthetic_dataset_from_the_IAM/24309730/1

Description:

This dataset was generated by randomly crossing out words from the IAM database using several types of strokes. The ratio of crossed-out words to regular words in handwritten documents varies greatly with the document and context, but crossed-out words are typically far fewer than regular words. To keep this ratio realistic in the synthetic database, 30% of the samples from the IAM training set were selected.

First, the bounding box of each word in a line was detected; the bounding box covers the core area of the word. Then a randomly chosen word was crossed out within its core area. Each line contains a randomly struck-out word at a different position. In the annotations, each struck-out word was replaced with the symbol #.

The folder contains:
s-s0 images
Syn-trainset
Syn-validset
Syn_IAM_testset

The transcription files use the format:
Filename, threshold label of handwritten line
For example:
s-s0-0,157 A # to stop Mr. Gaitskell from

If you use this dataset, cite the following work: "A deep learning approach to handwritten text recognition in the presence of struck-out text", https://ieeexplore.ieee.org/document/8961024
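The transcription format described above can be read with a few lines of Python. This is only a minimal sketch based on the single example line in the description: it assumes the filename is separated from the rest by the first comma, that the threshold is an integer, and that one space separates the threshold from the label. The names `LineRecord` and `parse_transcription_line` are hypothetical and are not part of the dataset.

```python
from typing import NamedTuple


class LineRecord(NamedTuple):
    """One transcription entry (hypothetical container, not part of the dataset)."""
    filename: str
    threshold: int
    label: str


def parse_transcription_line(line: str) -> LineRecord:
    # Assumption: the first comma separates the filename from "threshold label".
    filename, rest = line.strip().split(",", 1)
    # Assumption: the first space separates the integer threshold from the label.
    threshold, label = rest.strip().split(" ", 1)
    return LineRecord(filename.strip(), int(threshold), label)


# Example line taken from the dataset description.
record = parse_transcription_line("s-s0-0,157 A # to stop Mr. Gaitskell from")

# Struck-out words are annotated with the symbol "#".
struck_out_count = record.label.split().count("#")
print(record.filename, record.threshold, struck_out_count)
```

If the actual files use a different delimiter or include a header row, the two `split` calls would need to be adjusted accordingly.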

Provider:

RMIT University

Created:

2023-10-14
