
Gaokao Admission Table Segmentation

ModelScope community. Updated 2025-11-24; indexed 2025-11-15.
Download link:
https://modelscope.cn/datasets/wqzh117/Gaokao-Admission-Chart-Segment
Resource overview:
Admission and enrollment guidebooks published by provinces and municipalities across China contain tables, some of which are laid out as 2 or 3 side-by-side columns of data. For Gaokao (National College Entrance Examination) college application services, these tables must be parsed with Optical Character Recognition (OCR) or a Vision-Language Model (VLM), and the extracted data stored in a database. Processing each column separately avoids rows from adjacent columns being mixed together, which makes table parsing considerably more accurate. To that end, the author collected images, annotated the tables in more than 1,150 of them with the LabelImg annotation tool, and trained three YOLO11 detection models (nano, small, medium).
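
The column-by-column workflow described above can be sketched as follows. This is a minimal illustration, not the author's pipeline: it assumes a detector has already returned boxes as `(x1, y1, x2, y2, confidence)` in pixel coordinates, and the function name `column_crop_boxes` and the `min_conf` threshold are hypothetical. It filters weak detections, clamps boxes to the page, and orders the columns left to right so each can be cropped and parsed on its own.

```python
import math
from typing import List, Tuple

# Assumed detection layout: (x1, y1, x2, y2, confidence) in pixel coordinates.
Detection = Tuple[float, float, float, float, float]
# Crop rectangle: (left, upper, right, lower) in whole pixels.
CropBox = Tuple[int, int, int, int]


def column_crop_boxes(
    detections: List[Detection],
    img_w: int,
    img_h: int,
    min_conf: float = 0.5,  # hypothetical threshold, tune for your model
) -> List[CropBox]:
    """Return integer crop rectangles for detected columns, left to right."""
    boxes: List[CropBox] = []
    for x1, y1, x2, y2, conf in detections:
        if conf < min_conf:
            continue  # drop low-confidence detections
        # Expand outward to whole pixels, then clamp to the image bounds.
        left = max(0, math.floor(x1))
        upper = max(0, math.floor(y1))
        right = min(img_w, math.ceil(x2))
        lower = min(img_h, math.ceil(y2))
        if right > left and lower > upper:
            boxes.append((left, upper, right, lower))
    # Reading order for side-by-side columns: sort by the left edge.
    boxes.sort(key=lambda box: box[0])
    return boxes
```

Each returned tuple uses the `(left, upper, right, lower)` order, so it can be passed directly to a cropping routine such as Pillow's `Image.crop` before running OCR or a VLM on the individual column.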
Provider:
maas
Created:
2025-11-12
Curated Summary
Dataset Introduction
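Background and Challenges
Background Overview
This dataset aims to improve the accuracy of table parsing with OCR or vision-language models by segmenting the multi-column data in Gaokao admission tables. It contains more than 1,150 annotated images with corresponding YOLO-format labels, and YOLO11 models were trained on it.
The above content was collected and summarized by 遇见数据集.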
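
For readers unfamiliar with YOLO-format labels, the sketch below shows how one label line maps to a pixel box. It relies on the standard YOLO text layout, where each line is `class x_center y_center width height` with coordinates normalized to the image size. The function names are hypothetical, and the class ids used in this particular dataset are not stated in the listing, so the `0` in the example is only an assumption.

```python
from typing import List, Tuple

PixelBox = Tuple[float, float, float, float]  # x1, y1, x2, y2 in pixels


def parse_yolo_label(line: str, img_w: int, img_h: int) -> Tuple[int, PixelBox]:
    """Convert one YOLO label line into (class_id, pixel box)."""
    parts = line.split()
    if len(parts) != 5:
        raise ValueError(f"expected 5 fields, got {len(parts)}: {line!r}")
    class_id = int(parts[0])
    cx, cy, w, h = (float(p) for p in parts[1:])
    # Normalized center/size -> absolute corner coordinates.
    x1 = (cx - w / 2) * img_w
    y1 = (cy - h / 2) * img_h
    x2 = (cx + w / 2) * img_w
    y2 = (cy + h / 2) * img_h
    return class_id, (x1, y1, x2, y2)


def parse_yolo_labels(text: str, img_w: int, img_h: int) -> List[Tuple[int, PixelBox]]:
    """Parse every non-blank line of a YOLO label file's contents."""
    return [
        parse_yolo_label(line, img_w, img_h)
        for line in text.splitlines()
        if line.strip()
    ]
```

For example, `parse_yolo_label("0 0.5 0.5 0.5 1.0", 1000, 800)` describes a box centered on the page that spans half its width and its full height.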