ScienceOne-AI/S1-DeepResearch-15k
收藏Hugging Face2026-04-09 更新2026-04-12 收录
下载链接:
https://hf-mirror.com/datasets/ScienceOne-AI/S1-DeepResearch-15k
下载链接
链接失效反馈官方服务:
资源简介:
---
license: apache-2.0
language:
- en
- zh
tags:
- agent
size_categories:
- 10K<n<100K
---
# S1-DeepResearch-15k Dataset
## Overview
The **S1-DeepResearch dataset** is a curated collection of approximately **15k samples** designed to improve **deep research capabilities** of large language models.
The dataset includes two types of tasks:
- Verifiable tasks (labeled as "Closed-ended Multi-hop Resolution")
- Open-ended tasks (labeled as "Open-ended Exploration")
## Dataset Composition
The dataset consists of **6 task categories**:
- **Deep Research Report Writing**: Generates well-structured, evidence-based research reports with clear argumentation, source integration, and traceability.
- **Deep Research Instruction Following**: Interprets and executes complex, multi-constraint research instructions across the full pipeline from task definition to result presentation.
- **Complex Reasoning**: Covers general-purpose reasoning tasks requiring multi-step inference, logical consistency, and robust problem-solving across diverse domains.
- **Skill Use**: Enables dynamic composition and execution of modular skills (e.g., retrieval, analysis, modeling, and visualization) to support end-to-end workflows.
- **Multimodal Reasoning**: Handles reasoning over multiple data modalities (e.g., text, images, tables) with integrated understanding and inference.
- **Document Understanding & Generation**: Supports parsing and generating structured outputs from diverse document formats, forming a closed loop of understanding, transformation, and generation.
### Data Schema
Each sample is structured as follows:
- meta:
- id: Unique identifier (hash generated from the question)
- question: Input query
- answer: Reference answer (required for verifiable tasks, optional for generation tasks)
- language: Language of the sample (en or zh)
- type: Task type label, one of:
- Closed-ended Multi-hop Resolution (verifiable tasks)
- Open-ended Exploration (open-ended tasks)
- messages:
- role: One of system / assistant / tool / user
- content: Message content
## Related Resources
For more details, please refer to the model repository: [S1-DeepResearch](https://github.com/ScienceOne-AI/S1-DeepResearch)
提供机构:
ScienceOne-AI



