Datatang: OCR Data of 39,993 Internet Images
Source: ModelScope community (魔搭社区). Updated: 2025-12-05. Indexed: 2024-05-15.
Download link:
https://modelscope.cn/datasets/DatatangBeijing/39993Images-OCRDataofInternetImage
Resource description:
This dataset contains 39,993 internet images for OCR. Collection scenarios include film and television subtitles, advertisements, mobile phone screenshots, comics, memes, posters, and magazine covers, among others. The text is in Chinese, with a small amount of English. Images are annotated with line-level rectangular bounding boxes and line-level content transcriptions; a small portion of the data instead uses column-level rectangular bounding boxes and column-level transcriptions. The dataset can be used for a range of internet image OCR tasks.
Provider:
maas
Created:
2024-04-25
Collected summary
Background overview
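The dataset contains 39,993 internet images intended for a variety of OCR tasks, covering scenarios such as film subtitles, advertisements, and mobile screenshots. The text is mainly Chinese, with a small amount of English. Annotations consist of line-level rectangular bounding boxes with content transcriptions. Images are in .jpg format and annotations are in .json format. Detection and transcription accuracy are each no lower than 97%.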
The above content was collected and summarized by 遇见数据集.
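Since the data ships as .jpg images with .json annotations, a loader would typically pair the two and read each line's box and transcription. The following is only a minimal sketch under stated assumptions: the page does not document the directory layout or the JSON schema, so the convention that each image has a same-named .json file beside it, the `lines`, `box`, and `text` keys, and the `[x, y, width, height]` box layout are all hypothetical and should be adjusted to the real files.

```python
import json
from pathlib import Path


def pair_images_with_annotations(root):
    """Return (image_path, annotation_path) pairs that share a file stem.

    ASSUMPTION: each .jpg has a same-named .json in the same directory.
    Images without an annotation file are skipped.
    """
    root = Path(root)
    pairs = []
    for image_path in sorted(root.rglob("*.jpg")):
        annotation_path = image_path.with_suffix(".json")
        if annotation_path.exists():
            pairs.append((image_path, annotation_path))
    return pairs


def load_text_lines(annotation_path):
    """Read one annotation file under a HYPOTHETICAL schema.

    Assumed shape: {"lines": [{"box": [x, y, w, h], "text": "..."}]}
    Returns boxes converted to (left, top, right, bottom).
    """
    with open(annotation_path, encoding="utf-8") as f:
        record = json.load(f)
    lines = []
    for item in record.get("lines", []):  # "lines" is a hypothetical key
        x, y, w, h = item["box"]  # hypothetical [x, y, width, height] layout
        lines.append({"box": (x, y, x + w, y + h), "text": item["text"]})
    return lines
```

Converting to corner coordinates is a design choice made here because many cropping and drawing routines take (left, top, right, bottom). The column-level annotations mentioned in the description may need separate handling, which this sketch does not attempt.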



