
The ICDAR 2003 Informal Competition for the Recognition of On-line Words: The Unipen-ICROW-03 benchmark set - Version 0.0

Indexed from NIAID Data Ecosystem on 2026-05-02
Download link:
https://zenodo.org/record/7631141
Description:
Proposal for an informal benchmark on word recognition. See for the related ImUnipen collection of word images from on-line vectorial handwriting data: https://zenodo.org/record/1195059

At the time (ICDAR 2003) there was not a lot of interest so the project was not pursued.

Lambert Schomaker - February 2023
_______________________________________________________________________________

The ICDAR 2003 Informal Competition for the Recognition of On-line Words:
The Unipen-ICROW-03 benchmark set
Version 0.0

Lambert Schomaker / International Unipen Foundation

The ICROW suite of test files for the recognition of isolated on-line free-style (handprint, mixed and cursive) words has been composed. Different tablets, nationalities and languages are involved. Only the ASCII set is used within word labels.

The set contains:

   13119 written words
     884 unique lexical word entries
      72 writers

Language: Dutch, English, Italian.
Nationalities: Dutch, Irish, Italian, + mixed

The benchmark test is a good estimator for "walk-up" recognition performance.

[Note: some of the writers (NIC-Pc95*.dat set) are present in the UNIPEN R01/V07 distribution, but the actual words are unseen outside of the Int. Unipen Foundation.]

Please note the Copyright notice in the accompanying file 'Copyright'

Wed Jul 16 21:20:10 CEST 2003
Lambert Schomaker
---------------------------------------------------------------------------

Instructions for the ICDAR 2003 informal competition for the recognition of on-line words.

1 - unpack the .tgz file
2 - use the UNIPEN files as input for your recognizer.
3 - report, for each writer, a file .res

    Example: do-my-recognizer < NIC-Hi93b-marc.dat > NIC-Hi93b-marc.res

    Format of the .res file. No XML for this moment: simplicity does it.
    We assume that the recognizer is able to produce a top-10 list of
    likely words, sorted from most likely to least likely. The output
    for each word is on a single line. The correct target word is in
    the first column.

    <2nd-best word hyp.> ... <10th-best word hyp>
    <2nd-best word hyp.> ... <10th-best word hyp>

    Example with two words:

    summertime   slumbertime slipknot summertime somatome spumante simulative semitone schoolmate sermonette semimature
    Aberdeen     Adamson Aberdeen Addison Armageddon Abyssinian Araban Albanian Alabamian Abraham Adelaide

4 - pack the *.res files in a .tgz or .zip file and send them
    to schomaker@ai.rug.nl
    All *.dat files need to be processed.

LS.
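The .res layout described above (target word in the first column, followed by the ranked hypothesis list) can be read and scored with a few lines of Python. This is only a minimal sketch under stated assumptions: fields are taken to be whitespace-separated, matching is case-sensitive, and top-n accuracy is used as the score. The source does not say how submissions were to be evaluated, and the names `parse_res_line` and `top_n_accuracy` are hypothetical helpers, not part of any ICROW tooling.

```python
from typing import Iterable, List, Tuple


def parse_res_line(line: str) -> Tuple[str, List[str]]:
    """Split one .res line into (target word, ranked hypotheses).

    Assumption: fields are separated by whitespace and no word
    contains a space.
    """
    fields = line.split()
    if not fields:
        raise ValueError("empty .res line")
    return fields[0], fields[1:]


def top_n_accuracy(lines: Iterable[str], n: int) -> float:
    """Fraction of words whose target appears among the first n hypotheses.

    Assumption: exact, case-sensitive string match. Blank lines are skipped.
    """
    total = 0
    hits = 0
    for line in lines:
        if not line.strip():
            continue
        target, hypotheses = parse_res_line(line)
        total += 1
        if target in hypotheses[:n]:
            hits += 1
    return hits / total if total else 0.0


# The two example lines given in the instructions.
example = [
    "summertime   slumbertime slipknot summertime somatome spumante "
    "simulative semitone schoolmate sermonette semimature",
    "Aberdeen     Adamson Aberdeen Addison Armageddon Abyssinian "
    "Araban Albanian Alabamian Abraham Adelaide",
]

top1 = top_n_accuracy(example, 1)
top10 = top_n_accuracy(example, 10)
print(f"top-1: {top1:.2f}  top-10: {top10:.2f}")
```

On the two example lines neither target is ranked first, but both appear within the top 10, so the sketch reports a top-1 accuracy of 0.00 and a top-10 accuracy of 1.00. To score a real result file, pass an open file handle (for instance for NIC-Hi93b-marc.res) in place of `example`.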

Created:
2024-07-12