
Handwritten synthetic dataset from the IAM

Source: DataCite Commons. Updated: 2025-04-01. Indexed: 2024-07-13.
Download link:
https://rmit.figshare.com/articles/dataset/Handwritten_synthetic_dataset_from_the_IAM/24309730/1
Description:
This dataset was generated by randomly crossing out words from the IAM database using several types of strokes. The ratio of crossed-out words to regular words in handwritten documents can vary greatly depending on the document and context, but the number of crossed-out words is typically small compared with the number of regular words. To ensure a realistic ratio of regular to crossed-out words in our synthetic database, 30% of the samples from the IAM training set were selected.

First, the bounding box of each word in a line was detected. The bounding box covers the core area of the word. Then a randomly chosen word was crossed out within its core area. Each line contains one randomly struck-out word at a different position. The annotation of each struck-out word was replaced with the symbol #.

The folder contains:
s-s0 images
Syn-trainset
Syn-validset
Syn_IAM_testset

The transcription files are in the format:
Filename, threshold label of handwritten line
s-s0-0,157 A # to stop Mr. Gaitskell from

Cite the work below if you have used this dataset:
"A deep learning approach to handwritten text recognition in the presence of struck-out text"
https://ieeexplore.ieee.org/document/8961024
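The transcription format described above could be read with a small parser such as the following. This is a minimal sketch, not part of the dataset's own tooling: it assumes that the first comma separates the filename from the rest of the line, and that the first space after the comma separates the threshold from the label, which is how the single example line appears to be laid out. The names `TranscriptionLine`, `parse_transcription_line`, and `struck_out_positions` are hypothetical.

```python
from typing import List, NamedTuple


class TranscriptionLine(NamedTuple):
    """One record of a transcription file (field names are assumptions)."""
    filename: str
    threshold: int
    label: str


def parse_transcription_line(line: str) -> TranscriptionLine:
    """Split 'filename,threshold label' into its three parts.

    Assumes the first comma ends the filename and the first space
    after it ends the threshold; the remainder is the label text.
    """
    filename, rest = line.rstrip("\n").split(",", 1)
    threshold_text, label = rest.strip().split(" ", 1)
    return TranscriptionLine(filename.strip(), int(threshold_text), label.strip())


def struck_out_positions(label: str, marker: str = "#") -> List[int]:
    """Return the word indices in the label that hold the struck-out marker."""
    return [i for i, token in enumerate(label.split()) if token == marker]


# The example line given in the dataset description.
record = parse_transcription_line("s-s0-0,157 A # to stop Mr. Gaitskell from")
positions = struck_out_positions(record.label)
```

For the example line, `record.label` keeps the `#` symbol in place of the struck-out word, so `positions` identifies which word in the line image was crossed out. If the actual files use a different delimiter or contain labels with no threshold field, the splitting logic would need to be adjusted.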
Provider:
RMIT University
Created:
2023-10-14