
RxHandBD: A Handwritten Prescription Word Image Dataset

Source: DataCite Commons. Updated: 2026-03-20. Indexed: 2026-05-04.
Download link:
https://data.mendeley.com/datasets/dsb5r6vskg/3
Description:
This dataset provides a standardized, ready-to-use collection of 5,578 cropped handwritten word images extracted from physical medical prescriptions. It is explicitly designed to accelerate research and development in Optical Character Recognition (OCR) and Handwritten Text Recognition (HTR) systems within the healthcare domain.

#Two Versions of the Dataset:
1. Original Raw Data (RxHandBD-Raw.zip)
2. AI Compatible Data (RxHandBD-ML.zip)

Dataset Structure & Characteristics
To facilitate immediate machine learning application, the dataset has been pre-organized into standard Training and Testing splits (an 80/20 ratio). All images are standardized to a 512x512 pixel resolution so that they share a uniform size at neural network input layers.

#Total Images: 5,578 (.jpg format)
#Vocabulary: 1,559 unique text entries (including generic medicines, pharmaceutical brands, dosage forms, and clinical instructions).
#Training Set: 4,463 images (80% of the dataset) accompanied by train_labels.csv.
#Testing Set: 1,115 images (20% of the dataset) accompanied by test_labels.csv.

#Potential Use Cases
Digitizing handwritten prescriptions is a critical step in modernizing healthcare systems, reducing medication dispensing errors, and automating pharmacy workflows. By providing a clean, pre-split, and challenging benchmark of natural physician handwriting, this dataset enables researchers to directly train, validate, and compare deep learning architectures (such as CRNNs or Vision Transformers) for medical text extraction.
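The description names train_labels.csv and test_labels.csv but does not document their layout. The following is a minimal sketch, not the dataset authors' loader, for reading the two label files and checking them against the stated figures. The column names `filename` and `label` and the helper names `load_labels` and `split_summary` are assumptions made for illustration; adjust them to the real CSV header.

```python
import csv


def load_labels(csv_path, image_col="filename", label_col="label"):
    """Read a labels CSV into a list of (image_name, text) pairs.

    The default column names are assumptions, not taken from the dataset.
    """
    with open(csv_path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        # Fail early with a clear message if the assumed header is wrong.
        missing = {image_col, label_col} - set(reader.fieldnames or [])
        if missing:
            raise KeyError(f"columns not found in {csv_path}: {sorted(missing)}")
        return [(row[image_col], row[label_col]) for row in reader]


def split_summary(train_pairs, test_pairs):
    """Return the total size, training fraction, and unique label count."""
    total = len(train_pairs) + len(test_pairs)
    vocabulary = {text for _, text in train_pairs} | {text for _, text in test_pairs}
    return {
        "total": total,
        "train_fraction": len(train_pairs) / total if total else 0.0,
        "unique_labels": len(vocabulary),
    }
```

If the files match the description, `split_summary(load_labels("train_labels.csv"), load_labels("test_labels.csv"))` should report a total of 5,578 and a training fraction of about 0.80 (4,463 of 5,578). Whether the unique label count equals the stated 1,559 depends on how the authors counted vocabulary entries, for example with or without case normalization.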
Provider:
Mendeley Data
Created:
2026-03-20