Image-to-Text Bilingual Dataset from Medical Prescriptions

Name: Image-to-Text Bilingual Dataset from Medical Prescriptions
Creator: Bangabandhu Sheikh Mujibur Rahman Digital University
License: No description available

Source: Mendeley Data (indexed 2026-04-09)

Download link:

https://data.mendeley.com/datasets/tg2hm7n2bs/1

Description:

The dataset contains 600 images of handwritten medical prescriptions, each annotated in both Bangla and English, making it suitable for OCR, NLP, machine translation, and machine-learning-based healthcare research. `annotations.csv` contains one row per prescription with its Bangla and English text. Only clear, complete prescriptions were selected, and patient names and other identifying details were blacked out. The images were collected from hospitals, clinics, and pharmacies in Gazipur, Bangladesh. The dataset is intended for research and educational use. See README.pdf in the dataset for usage details.
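
As a rough illustration of working with the one-row-per-prescription layout described above, the following Python sketch loads `annotations.csv` and reports its header and row count. It is not taken from the dataset's README: the function name `load_annotations`, the file location, and the UTF-8 encoding are assumptions, and the column names are read from the file's header rather than hard-coded because the actual header is not stated here.

```python
import csv
from pathlib import Path


def load_annotations(csv_path):
    """Return (column_names, rows) from an annotations CSV.

    Column names come from the header row instead of being hard-coded,
    because the real header of annotations.csv is not documented here.
    """
    # Assumption: the file is UTF-8 encoded. "utf-8-sig" reads plain UTF-8
    # and also strips a leading byte-order mark if one is present.
    with Path(csv_path).open(newline="", encoding="utf-8-sig") as handle:
        reader = csv.DictReader(handle)
        rows = list(reader)
        columns = list(reader.fieldnames or [])
    return columns, rows


if __name__ == "__main__":
    path = Path("annotations.csv")  # assumed location after download
    if path.exists():
        columns, rows = load_annotations(path)
        print("columns:", columns)
        print("rows:", len(rows))
    else:
        print("annotations.csv not found; download the dataset first")
```

Each returned row is a dictionary keyed by whatever column names the file actually uses, so the Bangla and English text can be looked up once the header has been inspected.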

Provider:

Bangabandhu Sheikh Mujibur Rahman Digital University
