MMDocIR/MMDocIR_Evaluation_Dataset
Source: Hugging Face. Updated: 2025-05-28. Indexed: 2025-05-31.
Download link:
https://hf-mirror.com/datasets/MMDocIR/MMDocIR_Evaluation_Dataset
Description:
The MMDocIR evaluation dataset contains 313 long documents averaging 65.1 pages each, grouped into ten main domains: research reports, administration & industry, tutorials & workshops, academic papers, brochures, financial reports, guidebooks, government documents, laws, and news articles. It also includes 1,658 questions, 2,107 page labels, and 2,638 layout labels. The modality distribution is text (60.4%), image (18.8%), table (16.7%), and other modalities (4.1%).
Provider:
MMDocIR



