Misraj/KITAB_pdf_to_markdown_reviewed
Source: Hugging Face. Updated: 2025-09-24. Indexed: 2025-10-18.
Download link:
https://hf-mirror.com/datasets/Misraj/KITAB_pdf_to_markdown_reviewed
Description:
KITAB PDF to Markdown (Reviewed) is a manually reviewed and corrected subset of KITAB-Bench PDF-to-Markdown for evaluating OCR on Arabic documents. It fixes ground-truth errors, removes hallucinated content, restores omitted small-font text, and keeps the original task and schema unchanged, providing a more reliable benchmark for model comparison.
Provider:
Misraj



