
QwQ-LongCoT-Verified-130K

Source: ModelScope community. Updated: 2026-01-02. Indexed: 2025-01-04.
Download link:
https://modelscope.cn/datasets/AI-ModelScope/QwQ-LongCoT-Verified-130K
Resource description:
Original Dataset: [amphora/QwQ-LongCoT-130K](https://huggingface.co/datasets/amphora/QwQ-LongCoT-130K)

QwQ 32B Preview isn't perfect :)

**Note**: Around 5-7% of the processed data might be incorrectly labeled as "unverified" because QwQ's output isn't *exactly* the same as the original solution from NuminaMathCoT. I believe this can be solved with another round of processing with a smarter model but Qwen 2.5 3B Instruct is good enough to check if the solution is exactly the same. Magpie data is also "unverified" and has an empty "solution" column.

| Subset | Information | Rows |
|----------------------|----------------------------------------------------------|-------|
| **default** | Includes all the data from the original dataset. | 133k |
| **magpie** | Just the Magpie generated data. | 43k |
| **numina-math-only** | Only the problems coming from the NuminaMathCoT dataset. | 90.1k |
| **verified** | Keeps only the verified data. | 64.6k |
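As a rough illustration of how the four subsets relate to one another, here is a minimal sketch over a tiny in-memory sample. The field names `source` and `verified`, along with the sample rows, are assumptions made for the example and may not match the real schema; only the `solution` column is named in the description above.

```python
from typing import Dict, List

Row = Dict[str, object]


def split_subsets(rows: List[Row]) -> Dict[str, List[Row]]:
    """Partition rows into the four subsets named in the table.

    Assumes each row has a hypothetical `source` field ("magpie" or
    "numina") and a hypothetical boolean `verified` field.
    """
    magpie = [r for r in rows if r["source"] == "magpie"]
    numina = [r for r in rows if r["source"] == "numina"]
    verified = [r for r in rows if r["verified"]]
    return {
        "default": list(rows),        # everything from the original dataset
        "magpie": magpie,             # Magpie-generated rows only
        "numina-math-only": numina,   # NuminaMathCoT problems only
        "verified": verified,         # rows whose answer was verified
    }


# Hypothetical sample: Magpie rows are unverified and carry an empty solution.
sample_rows: List[Row] = [
    {"source": "numina", "problem": "1 + 1 = ?", "solution": "2", "verified": True},
    {"source": "numina", "problem": "2 * 3 = ?", "solution": "6", "verified": False},
    {"source": "magpie", "problem": "Name a prime.", "solution": "", "verified": False},
]

subsets = split_subsets(sample_rows)
subset_sizes = {name: len(items) for name, items in subsets.items()}
print(subset_sizes)
```

In this sketch `default` is the union of `magpie` and `numina-math-only`, which matches the row counts in the table (43k + 90.1k ≈ 133k), and `verified` can only draw from the NuminaMathCoT rows because Magpie rows have no reference solution to compare against.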

Provider:
maas
Created:
2024-12-30