five

Poseidon-Reasoning-Mini-300K

收藏
魔搭社区2025-12-03 更新2025-07-26 收录
下载链接:
https://modelscope.cn/datasets/prithivMLmods/Poseidon-Reasoning-Mini-300K
下载链接
链接失效反馈
官方服务:
资源简介:
![F6TUuALOWI3VHVcVZ16jl.png](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/XCUMro2YZLuygg0_Z0N65.png) # **Poseidon-Reasoning-Mini-300K** > Poseidon-Reasoning-Mini-300K is a compact, high-quality reasoning dataset designed for advanced tasks in **mathematics**, **coding**, and **science**. This smaller-scale collection maintains the depth and quality of its larger counterparts, with a focus on multi-step and general reasoning—making it ideal for model pretraining, fine-tuning, benchmarking, and STEM educational applications. --- ## Quick Start with Hugging Face Datasets🤗 ```py pip install -U datasets ``` ```py from datasets import load_dataset dataset = load_dataset("prithivMLmods/Poseidon-Reasoning-Mini-300K", split="train") ``` --- ## Overview - **Dataset Name:** Poseidon-Reasoning-Mini-300K - **Curated by:** prithivMLmods - **Size:** ~300,000 entries - **Formats:** `.arrow`, Parquet - **Languages:** English - **License:** Apache-2.0 ## Key Features - **Compact & Rigorous:** Delivers concise, high-quality problems with comprehensive stepwise solutions. - **STEM Coverage:** Prioritizes mathematical, scientific, and coding problems, with a strong emphasis on logic and reasoning tasks. - **Optimized Curation:** Comprises expertly selected entries from larger derivative and open datasets, guaranteeing diversity and consistency. - **Adaptable Scale:** The 300K size is optimal for efficient experimentation and quick benchmarking without sacrificing complexity or depth. ## Dataset Structure Each sample includes: - **problem:** A clear, typically STEM-oriented question or prompt. - **solution:** Step-by-step, reasoning-based explanation or answer. ### Schema Example | Column | Type | Description | |----------|--------|----------------------------| | problem | string | Problem or reasoning prompt | | solution | string | Stepwise explanation/answer | --- ## Data Sources Poseidon-Reasoning-Mini-300K is a carefully curated derivative, sourced from: - **prithivMLmods/Poseidon-Reasoning-5M** - **glaiveai/reasoning-v1-20m** - **prithivMLmods/Open-Omega-Explora-2.5M** - Custom modular dataset contributions by prithivMLmods Each source was selected and filtered to maximize quality, clarity, and coverage of reasoning skills in math, science, and coding. ## Applications This mini dataset is ideal for the following use cases: - Fine-tuning and evaluating LLMs for STEM and general reasoning - Rapid benchmarking for research or educational models - Curriculum design for math, coding, and science toolchains - AI reasoning assessments and diagnostic tasks --- ## Citation If you use this dataset, please cite: ``` Poseidon-Reasoning-Mini-300K by prithivMLmods Derived and curated from: - prithivMLmods/Poseidon-Reasoning-5M - glaiveai/reasoning-v1-20m - prithivMLmods/Open-Omega-Explora-2.5M ``` ## License Distributed under the Apache-2.0 License. Always review underlying source dataset licenses for full compliance.

![F6TUuALOWI3VHVcVZ16jl.png](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/XCUMro2YZLuygg0_Z0N65.png) # **Poseidon-Reasoning-Mini-300K** > Poseidon-Reasoning-Mini-300K是一款轻量化、高质量的推理数据集,专为数学、编程与科学领域的进阶任务打造。这款轻量化数据集保留了其大型同类数据集的深度与质量,核心聚焦多步推理与通用推理任务,十分适配模型预训练、微调、基准测试以及STEM(科学、技术、工程、数学)教育应用场景。 --- ## Hugging Face 数据集🤗快速上手 py pip install -U datasets py from datasets import load_dataset dataset = load_dataset("prithivMLmods/Poseidon-Reasoning-Mini-300K", split="train") --- ## 数据集概览 - **数据集名称:** Poseidon-Reasoning-Mini-300K - **整理方:** prithivMLmods - **数据规模:** 约30万条数据 - **数据格式:** `.arrow`、Parquet - **使用语言:** 英语 - **许可证:** Apache-2.0 ## 核心特性 - **轻量化且严谨:** 提供简洁高质的问题,并附带完整的分步解答。 - **STEM领域覆盖:** 优先收录数学、科学与编程相关问题,重点聚焦逻辑与推理任务。 - **优化精选:** 从大型衍生开源数据集中经专业筛选得到样本,确保数据的多样性与一致性。 - **规模适配:** 30万条的规模可高效支持实验与快速基准测试,同时不会牺牲任务的复杂度与深度。 ## 数据集结构 每条样本包含: - **problem(问题):** 清晰的、通常面向STEM领域的提问或提示。 - **solution(解答):** 基于推理的分步解释或最终答案。 ### Schema示例 | 列名 | 类型 | 说明 | |----------|--------|----------------------------| | problem | 字符串 | 问题或推理提示 | | solution | 字符串 | 分步解释/最终答案 | --- ## 数据来源 Poseidon-Reasoning-Mini-300K是一款经精心整理的衍生数据集,其来源包括: - **prithivMLmods/Poseidon-Reasoning-5M** - **glaiveai/reasoning-v1-20m** - **prithivMLmods/Open-Omega-Explora-2.5M** - prithivMLmods贡献的自定义模块化数据集 所有数据源均经过筛选,以最大化数学、科学与编程领域推理技能相关数据的质量、清晰度与覆盖范围。 ## 应用场景 这款轻量化数据集适配以下应用场景: - 针对STEM领域与通用推理任务的大语言模型(Large Language Model, LLM)微调与评估 - 面向研究或教育类模型的快速基准测试 - 数学、编程与科学工具链的课程设计 - AI推理能力评估与诊断任务 --- ## 引用说明 若使用本数据集,请引用如下内容: Poseidon-Reasoning-Mini-300K by prithivMLmods Derived and curated from: - prithivMLmods/Poseidon-Reasoning-5M - glaiveai/reasoning-v1-20m - prithivMLmods/Open-Omega-Explora-2.5M ## 许可证 本数据集采用Apache-2.0许可证分发。使用前请务必核查底层源数据集的许可证以确保完全合规。
提供机构:
maas
创建时间:
2025-07-19
5,000+
优质数据集
54 个
任务类型
进入经典数据集
二维码
社区交流群

面向社区/商业的数据集话题

二维码
科研交流群

面向高校/科研机构的开源数据集话题

数据驱动未来

携手共赢发展

商业合作