anabury/Cambridgedataset

Source: Hugging Face. Updated 2025-11-06; indexed 2025-11-15.
Download link:
https://hf-mirror.com/datasets/anabury/Cambridgedataset
Description:
The Cambridge A-Level Multimodal Dataset is extracted from Cambridge A-Level past papers and contains text, tables, images, diagrams, and mathematical equations. It is designed for training and fine-tuning multimodal large language models (MLLMs) such as Qwen-VL, LLaVA, MiniCPM-V, and PaliGemma. The dataset supports multimodal reasoning, diagram and figure interpretation, equation parsing and symbolic reasoning, scientific text explanation with step-by-step reasoning, OCR and table understanding, and exam-style question answering. It is suited to educational LLMs, STEM tutoring systems, and instruction-tuned reasoning models.
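A minimal sketch of how the dataset might be fetched through the mirror listed above, assuming it is published in a format the Hugging Face `datasets` library can load directly. The helper names `split_dataset_url` and `load_cambridge` are hypothetical and used only for illustration. The page does not state split names or column names, so none are assumed here.

```python
import os
from urllib.parse import urlparse

# Download link as given on this page.
MIRROR_URL = "https://hf-mirror.com/datasets/anabury/Cambridgedataset"


def split_dataset_url(url: str) -> tuple[str, str]:
    """Split a Hub-style dataset URL into (endpoint, repo_id)."""
    parsed = urlparse(url)
    parts = [p for p in parsed.path.split("/") if p]
    # Expected path shape: /datasets/<owner>/<name>
    if len(parts) != 3 or parts[0] != "datasets":
        raise ValueError(f"not a dataset URL: {url}")
    endpoint = f"{parsed.scheme}://{parsed.netloc}"
    return endpoint, "/".join(parts[1:])


def load_cambridge(url: str = MIRROR_URL):
    """Download the dataset via the endpoint named in the URL (hypothetical helper)."""
    endpoint, repo_id = split_dataset_url(url)
    # huggingface_hub reads HF_ENDPOINT when it is first imported, so the
    # variable is set before importing `datasets`. An existing value is kept.
    os.environ.setdefault("HF_ENDPOINT", endpoint)
    from datasets import load_dataset  # assumes `pip install datasets`

    return load_dataset(repo_id)
```

Calling `load_cambridge()` in a fresh Python process and printing the result should list whatever splits and columns the repository actually defines. Inspect that output before writing any preprocessing code, since the schema is not documented on this page.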
Provider:
anabury