ScienceOne-AI/S1-DeepResearch-15k

Name: ScienceOne-AI/S1-DeepResearch-15k
Creator: ScienceOne-AI
Published: 2026-04-09 06:33:50
License: 暂无描述

Hugging Face2026-04-09 更新2026-04-12 收录

下载链接：

https://hf-mirror.com/datasets/ScienceOne-AI/S1-DeepResearch-15k

下载链接

链接失效反馈

官方服务：

资源简介：

--- license: apache-2.0 language: - en - zh tags: - agent size_categories: - 10K<n<100K --- # S1-DeepResearch-15k Dataset ## Overview The **S1-DeepResearch dataset** is a curated collection of approximately **15k samples** designed to improve **deep research capabilities** of large language models. The dataset includes two types of tasks: - Verifiable tasks (labeled as "Closed-ended Multi-hop Resolution") - Open-ended tasks (labeled as "Open-ended Exploration") ## Dataset Composition The dataset consists of **6 task categories**: - **Deep Research Report Writing**: Generates well-structured, evidence-based research reports with clear argumentation, source integration, and traceability. - **Deep Research Instruction Following**: Interprets and executes complex, multi-constraint research instructions across the full pipeline from task definition to result presentation. - **Complex Reasoning**: Covers general-purpose reasoning tasks requiring multi-step inference, logical consistency, and robust problem-solving across diverse domains. - **Skill Use**: Enables dynamic composition and execution of modular skills (e.g., retrieval, analysis, modeling, and visualization) to support end-to-end workflows. - **Multimodal Reasoning**: Handles reasoning over multiple data modalities (e.g., text, images, tables) with integrated understanding and inference. - **Document Understanding & Generation**: Supports parsing and generating structured outputs from diverse document formats, forming a closed loop of understanding, transformation, and generation. ### Data Schema Each sample is structured as follows: - meta: - id: Unique identifier (hash generated from the question) - question: Input query - answer: Reference answer (required for verifiable tasks, optional for generation tasks) - language: Language of the sample (en or zh) - type: Task type label, one of: - Closed-ended Multi-hop Resolution (verifiable tasks) - Open-ended Exploration (open-ended tasks) - messages: - role: One of system / assistant / tool / user - content: Message content ## Related Resources For more details, please refer to the model repository: [S1-DeepResearch](https://github.com/ScienceOne-AI/S1-DeepResearch)

提供机构：

ScienceOne-AI

5,000+

优质数据集

54 个

任务类型

进入经典数据集