chitanda/mathscale4o-800k
Hugging Face | Updated 2025-02-06 | Indexed 2025-02-15
Download link:
https://hf-mirror.com/datasets/chitanda/mathscale4o-800k
Description:
An SFT dataset for reproducing the experiments in the paper "Preference Optimization for Reasoning with Pseudo Feedback". It includes denoised SFT samples both with and without answer boxes.
Provider:
chitanda



