QwQ-LongCoT-Verified-130K

Name: QwQ-LongCoT-Verified-130K
Creator: maas
Published: 2026-01-02 16:19:27
License: no description available

ModelScope community · Updated 2026-01-02 · Indexed 2025-01-04

Download link:

https://modelscope.cn/datasets/AI-ModelScope/QwQ-LongCoT-Verified-130K


Resource description:

Original Dataset: [amphora/QwQ-LongCoT-130K](https://huggingface.co/datasets/amphora/QwQ-LongCoT-130K)

QwQ 32B Preview isn't perfect :)

**Note**: Around 5-7% of the processed data might be incorrectly labeled as "unverified" because QwQ's output isn't *exactly* the same as the original solution from NuminaMathCoT. I believe this can be solved with another round of processing with a smarter model, but Qwen 2.5 3B Instruct is good enough to check if the solution is exactly the same. Magpie data is also "unverified" and has an empty "solution" column.

| Subset | Information | Rows |
|----------------------|----------------------------------------------------------|-------|
| **default** | Includes all the data from the original dataset. | 133k |
| **magpie** | Just the Magpie generated data. | 43k |
| **numina-math-only** | Only the problems coming from the NuminaMathCoT dataset. | 90.1k |
| **verified** | Keeps only the verified data. | 64.6k |
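The subset sizes can be sanity-checked with a little arithmetic. The sketch below uses only the rounded row counts listed for the four subsets, so the results are approximate. It assumes that every verified row comes from the NuminaMathCoT portion, which follows from the note that Magpie data is "unverified"; the function name `summarize` is purely illustrative.

```python
# Rounded row counts as listed for each subset (approximate figures).
ROWS = {
    "default": 133_000,
    "magpie": 43_000,
    "numina-math-only": 90_100,
    "verified": 64_600,
}


def summarize(rows):
    """Compare subset sizes; all values inherit the rounding of the inputs."""
    # magpie + numina-math-only should roughly reproduce the default subset.
    parts_total = rows["magpie"] + rows["numina-math-only"]
    # Assumption: verified rows are drawn only from the NuminaMathCoT problems,
    # because Magpie rows have no reference solution to compare against.
    verified_share = rows["verified"] / rows["numina-math-only"]
    return {
        "parts_total": parts_total,
        "gap_vs_default": parts_total - rows["default"],
        "verified_share_of_numina": verified_share,
    }


summary = summarize(ROWS)
print(f"magpie + numina-math-only = {summary['parts_total']:,} rows")
print(f"difference from default   = {summary['gap_vs_default']:,} rows")
print(f"verified / numina-math    = {summary['verified_share_of_numina']:.1%}")
```

Under these rounded figures the two source subsets add up to the default subset to within rounding, and roughly 72% of the NuminaMathCoT problems pass verification. Since an estimated 5-7% of rows may be wrongly marked "unverified", the true share of correct solutions is plausibly somewhat higher.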


Provider:

maas

Created:

2024-12-30
