Datatang: 3,506 Hindi OCR Images with Annotation and Transcription

Name: Datatang: 3,506 Hindi OCR Images with Annotation and Transcription
Creator: maas
Published: 2025-11-12 16:14:51
License: No description available

ModelScope community. Updated: 2025-11-12. Indexed: 2024-05-15.

Download link:

https://modelscope.cn/datasets/DatatangBeijing/3506HindiOCRImagesData-ImageswithAnnotationandTranscription

Resource overview:

This Hindi OCR dataset contains 3,506 annotated and transcribed images in three categories: 2,056 natural scene images, 1,103 internet images, and 347 text images. Line-level content is annotated with quadrilateral bounding boxes and transcribed line by line; vertical column content is annotated with quadrilateral bounding boxes and transcribed column by column. The dataset can be used for tasks such as multi-scenario Hindi text recognition and Hindi photo translation.
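The listing does not document the annotation file format, so the following is only a minimal sketch of how one quadrilateral annotation with its transcription might be represented and checked in Python. The `TextRegion` record, its field names, the `"line"`/`"vertical"` labels, and the sample coordinates are all hypothetical; only the category counts come from the description above.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]

# Category counts as stated in the dataset description.
CATEGORY_COUNTS = {"natural_scene": 2056, "internet": 1103, "text": 347}


@dataclass
class TextRegion:
    """Hypothetical record for one annotated line or vertical column."""
    quad: List[Point]    # four corner points, listed in order around the box
    transcription: str   # transcribed Hindi text for this region
    orientation: str     # "line" or "vertical" (assumed labels)


def quad_area(quad: List[Point]) -> float:
    """Area of a simple quadrilateral, computed with the shoelace formula."""
    total = 0.0
    for i in range(len(quad)):
        x1, y1 = quad[i]
        x2, y2 = quad[(i + 1) % len(quad)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def axis_aligned_box(quad: List[Point]) -> Tuple[float, float, float, float]:
    """Smallest axis-aligned box (x_min, y_min, x_max, y_max) enclosing the quad."""
    xs = [p[0] for p in quad]
    ys = [p[1] for p in quad]
    return min(xs), min(ys), max(xs), max(ys)


# Made-up sample region, for illustration only.
region = TextRegion(
    quad=[(10, 20), (110, 25), (108, 60), (8, 55)],
    transcription="नमस्ते",
    orientation="line",
)

total_images = sum(CATEGORY_COUNTS.values())
area = quad_area(region.quad)
box = axis_aligned_box(region.quad)
```

A check like `quad_area` can help flag degenerate (zero-area) boxes before training, and `axis_aligned_box` gives a crop rectangle for recognizers that expect upright patches.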

Provider:

maas

Created:

2024-05-07

Collection summary

Dataset introduction