
ThaiOCRBench

Source: ModelScope community. Updated 2025-12-05; indexed 2025-12-06.
Download link:
https://modelscope.cn/datasets/scb10x/ThaiOCRBench
Resource description:
# ThaiOCRBench: A Task-Diverse Benchmark for Vision-Language Understanding in Thai

**ThaiOCRBench** is the first comprehensive benchmark for evaluating vision-language models (VLMs) on Thai text-rich visual understanding tasks. Inspired by OCRBench v2, it contains **2,808 human-annotated samples** across **13 diverse tasks**, including table parsing, chart understanding, full-page OCR, key information extraction, and visual question answering.

The benchmark enables **standardized zero-shot evaluation** for both proprietary and open-source models, revealing significant performance gaps and paving the way for document understanding in low-resource languages.

🚀 **Our paper _ThaiOCRBench_ has been accepted to the IJCNLP-AACL 2025 Main Conference!**

👉 **[📄 Read the Paper](https://arxiv.org/abs/2511.04479)**
👉 **[💻 GitHub Repository](https://github.com/scb-10x/ThaiOCRBench)**

## 📊 Dataset Statistics

| Task Type                      | Number of Samples |
|--------------------------------|-------------------|
| Text Recognition               | 333               |
| Table Parsing                  | 193               |
| Full-page OCR                  | 197               |
| Chart Parsing                  | 200               |
| Key Information Extraction     | 201               |
| Diagram VQA                    | 204               |
| Fine-grained Text Recognition  | 206               |
| Handwritten Content Extraction | 209               |
| Key Information Mapping        | 209               |
| Document Parsing               | 211               |
| Infographics VQA               | 213               |
| Document Classification        | 215               |
| Cognition VQA                  | 217               |
| **Total**                      | **2,808**         |

## 🧠 Performance of VLMs on ThaiOCRBench

<p align="center">
  <img src="https://raw.githubusercontent.com/scb-10x/ThaiOCRBench/main/pics/thaiocrbench_eval.png" width="70%" height="60%">
</p>

## 📘 Citation

If you use ThaiOCRBench in your research or applications, please cite our work:

```
@misc{nonesung2025thaiocrbenchtaskdiversebenchmarkvisionlanguage,
  title={ThaiOCRBench: A Task-Diverse Benchmark for Vision-Language Understanding in Thai},
  author={Surapon Nonesung and Teetouch Jaknamon and Sirinya Chaiophat and Natapong Nitarach and Chanakan Wittayasakpan and Warit Sirichotedumrong and Adisai Na-Thalang and Kunat Pipatanakul},
  year={2025},
  eprint={2511.04479},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2511.04479},
}
```
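As a quick sanity check on the Dataset Statistics table, the following sketch tallies the per-task counts and confirms that they add up to the stated 2,808 samples across 13 tasks. The counts are copied from the table; the names `TASK_COUNTS` and `summarize` are illustrative and not part of any ThaiOCRBench tooling.

```python
# Per-task sample counts, copied from the Dataset Statistics table.
# TASK_COUNTS and summarize are hypothetical names used for illustration only.
TASK_COUNTS = {
    "Text Recognition": 333,
    "Table Parsing": 193,
    "Full-page OCR": 197,
    "Chart Parsing": 200,
    "Key Information Extraction": 201,
    "Diagram VQA": 204,
    "Fine-grained Text Recognition": 206,
    "Handwritten Content Extraction": 209,
    "Key Information Mapping": 209,
    "Document Parsing": 211,
    "Infographics VQA": 213,
    "Document Classification": 215,
    "Cognition VQA": 217,
}


def summarize(counts):
    """Return the total sample count and each task's share of that total."""
    total = sum(counts.values())
    shares = {task: n / total for task, n in counts.items()}
    return total, shares


total, shares = summarize(TASK_COUNTS)
largest = max(TASK_COUNTS, key=TASK_COUNTS.get)

print(f"tasks: {len(TASK_COUNTS)}, total samples: {total}")
for task, share in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{task:32s} {TASK_COUNTS[task]:4d}  {share:6.1%}")
```

Because the tasks are not perfectly balanced (Text Recognition is the largest), an average weighted by sample count will differ somewhat from a plain mean over tasks, which is worth keeping in mind when aggregating scores.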

Provider:
maas
Created:
2025-12-02