
Geometry Exercise Data with Figures

Zhejiang Data Intellectual Property Registration Platform (浙江省数据知识产权登记平台) | Updated 2025-09-16 | Indexed 2025-09-17
Download link:
https://www.zjip.org.cn/home/announce/trends/180552
Resource description:
Analyzing and evaluating the dataset "Geometric Exercise Question-Answer Pairs and Problem-Solving Processes Generated Based on Geometric Entities and Relations" helps enterprises identify weaknesses of their current models in question-answering reasoning, image detail understanding, and relational reasoning. Training enterprise large models on the dataset is intended to improve their accuracy and stability in knowledge question answering, logical reasoning, and understanding of relations in images.

Using knowledge of planar geometric entities and relations, the pipeline generates planar geometry figures for different entities and relations, generates question-answer pairs, and generates and verifies solution processes. The process is as follows:

1. Figure generation via entity and relation sampling and filtering: Select multiple entities and relations from the pre-constructed planar entities (e.g., circles, squares) and relations (e.g., collinearity, midpoint). Based on the entity and relation attributes, filter out logically inconsistent combinations, and generate a logically valid planar geometry figure together with the entities and relations it depends on.
2. Logical dependency graph construction: Starting from the figure and the entities and relations it depends on, apply all applicable geometric theorems and derive new conclusions as new nodes using breadth-first search (BFS), producing a realizable logical dependency graph.
3. Reverse path marking from the target: For a given search depth, take one node of the logical dependency graph as the "answer to solve for" and trace backward to find all the premises and intermediate steps of its derivation.
4. Reasoning path generation: Arrange the marked path in topological order so that each node is a (condition, theorem, conclusion) triple.
5. Translation from geometric language to natural language: Use an LLM with prompt templates to polish the reasoning process into a solution process, generating exercises, answers, and solution approaches of different difficulty levels.
6. COT screening and correction: Use LLM-as-a-judge to check whether the translated solution has altered any condition, theorem, or conclusion. If it has, include that COT and the evaluation result in the prompt and regenerate the COT.
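Steps 2 to 4 can be illustrated with a minimal Python sketch. It is not the provider's implementation: the facts are plain strings, the three rules in `RULES` are hypothetical toy theorems, and the function names are invented for illustration. It assumes Python 3.9+ for the standard-library `graphlib` module.

```python
from collections import deque
from graphlib import TopologicalSorter

# Hypothetical toy rules: (theorem name, premises, conclusion).
RULES = [
    ("midpoint definition", frozenset({"M is midpoint of AB"}), "AM = MB"),
    ("substitution", frozenset({"AM = MB", "AB = 10"}), "AM = 5"),
    ("radius equality", frozenset({"O is center", "A, B on circle O"}), "OA = OB"),
]

def build_dependency_graph(given, rules):
    """Step 2: forward-chain in BFS order.

    Returns {fact: None} for givens and {fact: (theorem, premises)} for derived facts.
    """
    derived = {fact: None for fact in given}
    queue = deque(given)
    while queue:
        fact = queue.popleft()
        for name, premises, conclusion in rules:
            if (fact in premises and conclusion not in derived
                    and all(p in derived for p in premises)):
                derived[conclusion] = (name, premises)
                queue.append(conclusion)
    return derived

def trace_back(target, derived):
    """Step 3: collect the target and every fact it transitively depends on."""
    needed, stack = set(), [target]
    while stack:
        fact = stack.pop()
        if fact in needed:
            continue
        needed.add(fact)
        if derived[fact] is not None:
            stack.extend(derived[fact][1])
    return needed

def reasoning_path(target, derived):
    """Step 4: order the marked facts topologically as (conditions, theorem, conclusion)."""
    needed = trace_back(target, derived)
    # Map each fact to its predecessors; givens have none.
    graph = {f: (derived[f][1] if derived[f] else ()) for f in needed}
    order = TopologicalSorter(graph).static_order()
    return [(sorted(derived[f][1]), derived[f][0], f)
            for f in order if derived[f] is not None]

given = ["M is midpoint of AB", "AB = 10", "O is center"]
derived = build_dependency_graph(given, RULES)
path = reasoning_path("AM = 5", derived)
for conditions, theorem, conclusion in path:
    print(conditions, "->", theorem, "->", conclusion)
```

In this sketch the "radius equality" rule never fires because one of its premises is missing, so `OA = OB` does not enter the graph. A real system would use structured geometric predicates and a much larger theorem library rather than string matching.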
Provider:
瓴羊智能科技有限公司
Created:
2025-08-04
Collected summary
Dataset introduction
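Background and challenges
Background overview
This dataset contains 1,000 junior high school geometry exercises with figures, each including the problem, the answer, and a detailed solution process. It is intended for training enterprise large models to improve their reasoning and image understanding. The data is generated through geometric entity sampling and logical dependency graphs to ensure that the exercises are logically correct and diverse.
The above content was collected and summarized by 遇见数据集.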