
oNo-1/difficult_problem_dataset_v5_500

Hugging Face | Updated 2025-10-23 | Indexed 2026-01-03
Download link:
https://hf-mirror.com/datasets/oNo-1/difficult_problem_dataset_v5_500
Resource description:
---
license: odc-by
task_categories:
- text-generation
language:
- en
size_categories:
- n<1K
---

# Overview

This is a synthetic dataset created with the Scalable Data Generation (SDG) framework. It is structured for use with a thinking model: each record pairs a question (the input) with its reasoning and answer (the output).

# Pipeline of Data Generation

1. Process-based Question Generation - A mechanism for automatically generating questions.
2. Curation + Diversity Filter - A step that ensures quality and diversity, rather than simple generation.
3. Expansion via Evolutionary Methods - Improvement of questions and answers through an evolutionary strategy (a genetic algorithm-like refinement cycle).
4. Automatic Generation of Reasoning Process - Supplementing reasoning, explanation, and grounding.
5. Final Storage as a Dataset

# Dataset Structure

```
{
  "input": "question",
  "output": "reasoning and answer"
}
```

# Licence

ODC-BY 1.0

This work contains data from Microsoft Academic Graph, available under the ODC-BY 1.0 license. Source: https://aka.ms/msracad
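As a minimal sketch of how records with this structure might be consumed, the snippet below parses JSON Lines text and keeps only rows that have string `input` and `output` fields. The JSON Lines storage format, the `Record` type, the `parse_records` helper, and the sample rows are all assumptions made for illustration; the dataset card only specifies the two field names.

```python
import json
from typing import TypedDict


class Record(TypedDict):
    input: str   # the question
    output: str  # the reasoning followed by the answer


def parse_records(jsonl_text: str) -> list[Record]:
    """Parse JSON Lines text, keeping only rows with string input/output fields.

    Hypothetical helper: the on-disk format of the dataset is assumed here.
    """
    records: list[Record] = []
    for line in jsonl_text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        row = json.loads(line)
        if not isinstance(row, dict):
            continue  # skip rows that are not JSON objects
        if isinstance(row.get("input"), str) and isinstance(row.get("output"), str):
            records.append({"input": row["input"], "output": row["output"]})
    return records


# Made-up sample rows, not taken from the dataset.
sample = "\n".join([
    json.dumps({"input": "What is 2 + 2?", "output": "Adding 2 and 2 gives 4. Answer: 4"}),
    json.dumps({"input": "row with a missing output field"}),
])
records = parse_records(sample)
print(len(records), "valid record(s)")
```

Rows that lack either field are dropped rather than raising an error, which is one reasonable choice when screening synthetic data; a stricter loader could raise instead.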
Provider:
oNo-1