bitter-aloe/focus-raw-ocr
Hugging Face | Updated 2026-04-28 | Indexed 2026-05-03
Download link:
https://hf-mirror.com/datasets/bitter-aloe/focus-raw-ocr
Description:
Page-level corrected dots.ocr output for the *FOCUS on Political Repression in Southern Africa* news bulletin, published by the International Defence & Aid Fund from 1975 onwards. Each row is one page of one document, joined to its rendered page image and the manually validated dots.ocr layout JSON (bounding boxes, categories, text, and reading order). The dataset suits image-to-text and object-detection tasks and is intended mainly for OCR, layout analysis, historical-document work, and research on human rights in Southern Africa.
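As a rough illustration of how the per-page layout JSON might be consumed, the sketch below joins the text of layout elements in list order, which it treats as the reading order. The field names (`bbox`, `category`, `text`), the `[x1, y1, x2, y2]` box convention, the `Picture` category, and the sample values are all assumptions made for the example; they are not taken from this listing, so check them against the actual rows before relying on them.

```python
import json

# Hypothetical layout JSON for one page. Field names, category labels, and
# values are assumptions for illustration, not data from the dataset.
sample_layout = json.dumps([
    {"bbox": [40, 30, 560, 70], "category": "Title", "text": "FOCUS"},
    {"bbox": [40, 90, 290, 400], "category": "Text", "text": "First column."},
    {"bbox": [310, 90, 560, 300], "category": "Picture"},
    {"bbox": [310, 310, 560, 400], "category": "Text", "text": "Second column."},
])


def page_text(layout_json, skip_categories=("Picture",)):
    """Join element text, treating list order as the reading order (assumed)."""
    elements = json.loads(layout_json)
    parts = []
    for element in elements:
        if element.get("category") in skip_categories:
            continue
        text = element.get("text", "").strip()
        if text:
            parts.append(text)
    return "\n".join(parts)


def bbox_area(bbox):
    """Area of a box given as [x1, y1, x2, y2] (assumed convention)."""
    x1, y1, x2, y2 = bbox
    return max(0, x2 - x1) * max(0, y2 - y1)


print(page_text(sample_layout))
```

The same loop could emit `(bbox, category)` pairs instead of text if the goal is object detection rather than transcription.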
Provider:
bitter-aloe



