Handwritten synthetic dataset from the IAM
Source: Research Data Australia (indexed 2024-12-14)
Download link:
https://researchdata.edu.au/handwritten-synthetic-dataset-iam/2831142
Description:
This dataset was generated by randomly crossing out words from the IAM database, using several types of strokes. The ratio of crossed-out words to regular words in handwritten documents can vary greatly depending on the document and context. Typically, however, the number of crossed-out words is small compared with the number of regular words. To ensure a realistic ratio of regular to crossed-out words in our synthetic database, 30% of the samples from the IAM training set were selected. First, the bounding box of each word in a line was detected. The bounding box covers the core area of the word. Then a word was chosen at random and crossed out within its core area. Each line contains one randomly struck-out word, at a different position from line to line. The annotation of each struck-out word was replaced with the symbol #.

The folder has:
s-s0 images
Syn-trainset
Syn-validset
Syn_IAM_testset

The transcription files are in the format of:
Filename, threshold label of handwritten line
s-s0-0,157 A # to stop Mr. Gaitskell from

Cite the below work if you have used this dataset:
"A deep learning approach to handwritten text recognition in the presence of struck-out text"
https://ieeexplore.ieee.org/document/8961024
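
The annotation step and the transcription format described above can be sketched in Python. This is only an illustration, not the authors' code: the function names `strike_out_label` and `parse_transcription_line` are hypothetical, and the parser assumes that each transcription line reads `<filename>,<threshold> <label>`, with the filename in the example being `s-s0-0`. Check that layout against the actual files before relying on it. The image side of the process (drawing the stroke inside the word's bounding box) is not covered.

```python
import random
from typing import NamedTuple, Optional, Tuple

STRUCK_OUT = "#"  # symbol the dataset description uses for a struck-out word


def strike_out_label(label: str,
                     rng: Optional[random.Random] = None) -> Tuple[str, int]:
    """Replace one randomly chosen word of a line label with '#'.

    Returns the new label and the index of the replaced word.
    Hypothetical helper: it mirrors only the annotation step.
    """
    rng = rng or random.Random()
    words = label.split()
    if not words:
        raise ValueError("label contains no words")
    position = rng.randrange(len(words))
    words[position] = STRUCK_OUT
    return " ".join(words), position


class Transcription(NamedTuple):
    filename: str
    threshold: int
    label: str


def parse_transcription_line(line: str) -> Transcription:
    """Parse one transcription line.

    Assumed layout (not confirmed by the source):
    '<filename>,<threshold> <label>'.
    """
    # Split on the first comma only, since the label may contain commas.
    filename, rest = line.rstrip("\n").split(",", 1)
    # The threshold is the first space-delimited token after the comma.
    threshold, label = rest.split(" ", 1)
    return Transcription(filename.strip(), int(threshold), label)


if __name__ == "__main__":
    record = parse_transcription_line(
        "s-s0-0,157 A # to stop Mr. Gaitskell from")
    print(record)
    print(record.label.split().count(STRUCK_OUT), "struck-out word(s)")
```

Counting `#` tokens in the parsed labels is a quick way to confirm that each line carries exactly one struck-out word, as the description states.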
Provided by:
RMIT University, Australia



