
Bibek130/IAM-line

Source: Hugging Face. Updated 2025-12-04; indexed 2025-12-20.
Download link:
https://hf-mirror.com/datasets/Bibek130/IAM-line
Resource description:
---
license: mit
language:
- en
task_categories:
- image-to-text
pretty_name: IAM-line
dataset_info:
  features:
  - name: image
    dtype: image
  - name: text
    dtype: string
  splits:
  - name: train
    num_examples: 6482
  - name: validation
    num_examples: 976
  - name: test
    num_examples: 2915
  dataset_size: 10373
tags:
- atr
- htr
- ocr
- modern
- handwritten
---

# IAM - line level

## Table of Contents

- [IAM - line level](#iam-line-level)
- [Table of Contents](#table-of-contents)
- [Dataset Description](#dataset-description)
- [Dataset Summary](#dataset-summary)
- [Languages](#languages)
- [Dataset Structure](#dataset-structure)
- [Data Instances](#data-instances)
- [Data Fields](#data-fields)

## Dataset Description

- **Homepage:** [IAM Handwriting Database](https://fki.tic.heia-fr.ch/databases/iam-handwriting-database)
- **Paper:** [The IAM-database: an English sentence database for offline handwriting recognition](https://doi.org/10.1007/s100320200071)
- **Point of Contact:** [TEKLIA](https://teklia.com)

## Dataset Summary

The IAM Handwriting Database contains forms of handwritten English text which can be used to train and test handwritten text recognizers and to perform writer identification and verification experiments.

Note that all images are resized to a fixed height of 128 pixels.

### Languages

All the documents in the dataset are written in English.

## Dataset Structure

### Data Instances

```
{
  'image': <PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=2467x128 at 0x1A800E8E190>,
  'text': 'put down a resolution on the subject'
}
```

### Data Fields

- `image`: a PIL.Image.Image object containing the image. Note that when accessing the image column (using `dataset[0]["image"]`), the image file is automatically decoded. Decoding of a large number of image files might take a significant amount of time. Thus it is important to first query the sample index before the "image" column, i.e. `dataset[0]["image"]` should always be preferred over `dataset["image"][0]`.
- `text`: the label transcription of the image.
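
The row-first access pattern described above can be sketched as follows. This is a minimal illustration, not part of the original card: the helper names `load_iam_line` and `first_sample` are hypothetical, and the loader assumes the Hugging Face `datasets` package is installed and the repository `Bibek130/IAM-line` is reachable.

```python
from typing import Any, Mapping, Sequence, Tuple


def load_iam_line(split: str = "validation"):
    """Load one split ("train", "validation" or "test") from the Hub.

    Hypothetical helper: needs the `datasets` package and network access.
    The import is done lazily so that `first_sample` stays usable without it.
    """
    from datasets import load_dataset

    return load_dataset("Bibek130/IAM-line", split=split)


def first_sample(dataset: Sequence[Mapping[str, Any]]) -> Tuple[Any, str]:
    """Return (image, text) for the first row using row-first access."""
    # Index the row first, then read its columns: only this row's image
    # is decoded. The column-first form dataset["image"][0] is the one the
    # card warns against, since it may decode the whole image column.
    sample = dataset[0]
    return sample["image"], sample["text"]
```

With `datasets` available, `image, text = first_sample(load_iam_line("validation"))` should return one decoded PIL image and its transcription. Because the helper only relies on indexing, it also works on any list of dictionaries with `image` and `text` keys.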
Provider:
Bibek130