Wisdom-math/wisdom-math
Source: Hugging Face. Updated: 2024-12-04. Indexed: 2024-12-14.
Download link:
https://hf-mirror.com/datasets/Wisdom-math/wisdom-math
Resource overview:
WISDOM is a dataset of mathematical reasoning problems synthesized with a progressive curriculum method, with responses generated by DeepSeek Coder V2 and GPT-4o. By following an easy-to-hard curriculum, it aims to synthesize high-quality chain-of-thought (CoT) data step by step and thereby improve the performance of large language models on mathematical reasoning tasks. Training uses the Alpaca template, and a 10-gram hash deduplication method is applied to avoid data contamination. Training runs on 88 NVIDIA A800 GPUs with the AdamW optimizer and a cosine learning rate schedule.
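The 10-gram hash deduplication mentioned above can be illustrated with a minimal sketch. The description does not give the tokenization, the hash function, or the matching rule, so the choices here (lowercased word tokens, MD5, dropping a sample on any shared 10-gram) and the helper names `ngram_hashes`, `build_index`, and `decontaminate` are assumptions made for illustration only, not the dataset authors' implementation.

```python
import hashlib
import re

N = 10  # n-gram length named in the dataset description


def ngram_hashes(text, n=N):
    """Return the set of hashes of all word-level n-grams in `text`.

    Assumption: tokens are lowercased runs of word characters and the
    hash is MD5; the source does not specify either choice.
    """
    tokens = re.findall(r"\w+", text.lower())
    return {
        hashlib.md5(" ".join(tokens[i:i + n]).encode("utf-8")).hexdigest()
        for i in range(len(tokens) - n + 1)
    }


def build_index(benchmark_questions, n=N):
    """Collect the n-gram hashes of every benchmark question into one set."""
    index = set()
    for question in benchmark_questions:
        index |= ngram_hashes(question, n)
    return index


def decontaminate(samples, index, n=N):
    """Keep only samples that share no n-gram hash with the benchmark index."""
    return [s for s in samples if not (ngram_hashes(s, n) & index)]
```

In this sketch a text shorter than ten tokens produces no n-grams and is therefore always kept; a stricter pipeline might handle short texts with a separate exact-match check.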
Provided by:
Wisdom-math
Collected Summary
Dataset Introduction

Background and Challenges
Background Overview
The Wisdom-math/wisdom-math dataset is a collection of mathematical problems and solutions designed to improve the mathematical reasoning of large language models through progressive curriculum learning. It covers a wide range of topics and is distributed in JSON format, which makes it suitable for training and evaluating models on complex mathematical tasks.
The content above was collected and summarized by 遇见数据集.
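Because the dataset is described as JSON-formatted and trained with the Alpaca template, a short sketch may help show how one record could be turned into a prompt and target pair. The template text is the standard no-input Alpaca prompt. The field names `query` and `response` and the sample record are hypothetical, since the listing does not document the schema; check the actual files before relying on them.

```python
import json

# Standard Alpaca prompt for instructions without an extra input field.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:\n"
)


def to_training_pair(json_line, question_key="query", answer_key="response"):
    """Parse one JSON record and return (prompt, target).

    Assumption: `question_key` and `answer_key` are placeholder field
    names; the real dataset may use different keys.
    """
    record = json.loads(json_line)
    prompt = ALPACA_TEMPLATE.format(instruction=record[question_key])
    return prompt, record[answer_key]


# Made-up record for illustration only; it is not taken from the dataset.
example_line = json.dumps(
    {"query": "What is 12 * 7?", "response": "12 * 7 = 84. The answer is 84."}
)
prompt, target = to_training_pair(example_line)
```

Passing the key names as parameters keeps the sketch usable once the real schema is known, without changing the template logic.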



