3,506 Images of Hindi OCR Annotation and Transcription Data [Datatang (数据堂)]

Name: 3,506 Images of Hindi OCR Annotation and Transcription Data [Datatang (数据堂)]
Creator: shujutang
Published: 2024-05-28 14:23:20
License: no description available

Source: OpenDataLab. Updated: 2024-05-28. Indexed: 2024-06-01.

Download link:

https://opendatalab.org.cn/shujutang/shujutang1058

Resource description:

This dataset contains 3,506 Hindi images annotated and transcribed for OCR: 2,056 natural-scene images, 1,103 internet images, and 347 text images. Line-level content is annotated with quadrilateral bounding boxes, each paired with a transcription of the line; vertically arranged column content is annotated the same way, with a quadrilateral bounding box and a transcription for each column. The dataset can be used for tasks such as multi-scenario Hindi text recognition and Hindi photo translation.
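The listing does not specify the annotation file format, so the following is only a minimal sketch of how one quadrilateral-plus-transcription record might be represented in Python. The `TextRegion` class, its field names, the `orientation` values, and the sample coordinates are all hypothetical; only the three image counts come from the description above.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]


@dataclass
class TextRegion:
    """One annotated text region (hypothetical schema, not the dataset's real format)."""
    quad: List[Point]     # four corner points, listed in order around the box
    transcription: str    # transcribed Hindi text inside the box
    orientation: str      # assumed label: "line" or "column"

    def area(self) -> float:
        # Shoelace formula for the area of a simple polygon.
        total = 0.0
        n = len(self.quad)
        for i in range(n):
            x1, y1 = self.quad[i]
            x2, y2 = self.quad[(i + 1) % n]
            total += x1 * y2 - x2 * y1
        return abs(total) / 2.0

    def bounding_rect(self) -> Tuple[float, float, float, float]:
        # Axis-aligned rectangle (x_min, y_min, x_max, y_max) enclosing the quad,
        # e.g. for cropping the region before recognition.
        xs = [p[0] for p in self.quad]
        ys = [p[1] for p in self.quad]
        return (min(xs), min(ys), max(xs), max(ys))


# Image counts per source type, as stated in the description.
image_counts = {"natural_scene": 2056, "internet": 1103, "text": 347}
total_images = sum(image_counts.values())

# Made-up example record for illustration only.
region = TextRegion(
    quad=[(10, 20), (110, 25), (108, 60), (8, 55)],
    transcription="नमस्ते",
    orientation="line",
)
```

A quadrilateral rather than an axis-aligned rectangle lets the box follow skewed or perspective-distorted text, which is common in natural-scene images; the enclosing rectangle is then a convenient derived value.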

Provider:

shujutang

Created:

2024-05-28

Collected summary

Dataset introduction