
Wisdom-math/wisdom-math

Hugging Face | Updated 2024-12-04 | Indexed 2024-12-14
Download link:
https://hf-mirror.com/datasets/Wisdom-math/wisdom-math
Description:
The WISDOM dataset synthesizes mathematical reasoning questions with a progressive curriculum synthesis method and uses DeepSeek Coder V2 and GPT-4o to generate the responses. It aims to build high-quality CoT (Chain-of-Thought) data step by step, from easy to hard, in a curriculum-learning fashion, to improve the performance of large language models on mathematical reasoning tasks. Training uses the Alpaca template, and a 10-gram hash deduplication method is applied to avoid data contamination. Training is conducted on 88 NVIDIA A800 GPUs with the AdamW optimizer and a cosine learning rate schedule.
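The 10-gram hash deduplication step mentioned in the description can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: word-level tokenization, lowercasing, MD5 as the hash, and the function names `ngram_hashes`, `build_benchmark_index`, and `decontaminate` are all assumptions made for the example.

```python
import hashlib
import re

N = 10  # n-gram size named in the description


def ngram_hashes(text, n=N):
    """Return the set of hashes of all word-level n-grams in `text`.

    Tokenization (lowercased \\w+ runs) and MD5 are assumptions for this sketch.
    Texts with fewer than n tokens yield an empty set.
    """
    tokens = re.findall(r"\w+", text.lower())
    return {
        hashlib.md5(" ".join(tokens[i:i + n]).encode("utf-8")).hexdigest()
        for i in range(len(tokens) - n + 1)
    }


def build_benchmark_index(benchmark_questions, n=N):
    """Collect the n-gram hashes of every evaluation question into one set."""
    index = set()
    for question in benchmark_questions:
        index |= ngram_hashes(question, n)
    return index


def decontaminate(samples, benchmark_index, n=N):
    """Keep only samples that share no n-gram hash with the benchmark index."""
    return [s for s in samples if ngram_hashes(s, n).isdisjoint(benchmark_index)]


# Usage with made-up strings (hypothetical data, not taken from the dataset).
benchmark = ["What is the sum of the first ten positive even integers in this list"]
samples = [
    "Problem: what is the sum of the first ten positive even integers exactly",
    "Compute the derivative of x squared plus three x with respect to x at x equals two",
]
clean = decontaminate(samples, build_benchmark_index(benchmark))
```

Hashing the n-grams rather than storing them as strings keeps the benchmark index compact, and the set lookup makes each sample's check proportional to its own length.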
Provided by:
Wisdom-math
Collected Summary
Dataset Introduction
Background and Challenges
Background Overview
The 'Wisdom-math/wisdom-math' dataset is a comprehensive collection of mathematical problems and solutions, designed to improve the mathematical reasoning of large language models through progressive curriculum learning. It includes a wide range of topics and is formatted in JSON, making it suitable for training and evaluating models on complex mathematical tasks.
The content above was collected and summarized by 遇见数据集.
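Since the listing says the data is in JSON and that training uses the Alpaca template, the sketch below shows one way a JSON record might be turned into an Alpaca-style prompt. The template text is the standard no-input Alpaca prompt; whether WISDOM uses exactly this variant is an assumption, and the field names `question` and `response` and the helper `to_alpaca_example` are hypothetical, since the listing does not document the schema.

```python
import json

# Standard Alpaca prompt without an input field (assumed variant).
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:\n"
)


def to_alpaca_example(json_line, question_key="question", answer_key="response"):
    """Parse one JSON record and wrap its question in the Alpaca template.

    `question_key` and `answer_key` are hypothetical field names; adjust them
    to the actual schema of the dataset files.
    """
    record = json.loads(json_line)
    prompt = ALPACA_TEMPLATE.format(instruction=record[question_key])
    return {"prompt": prompt, "completion": record[answer_key]}


# Usage with a made-up record (not taken from the dataset).
line = json.dumps({"question": "What is 2 + 3?", "response": "2 + 3 = 5. The answer is 5."})
example = to_alpaca_example(line)
```

Passing the question as a keyword argument to `format` means braces inside the question (for instance LaTeX such as `\frac{1}{2}`) are inserted verbatim and not treated as placeholders.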