five

ScienceOne-AI/S1-DeepResearch-15k

收藏
Hugging Face2026-04-09 更新2026-04-12 收录
下载链接:
https://hf-mirror.com/datasets/ScienceOne-AI/S1-DeepResearch-15k
下载链接
链接失效反馈
官方服务:
资源简介:
--- license: apache-2.0 language: - en - zh tags: - agent size_categories: - 10K<n<100K --- # S1-DeepResearch-15k Dataset ## Overview The **S1-DeepResearch dataset** is a curated collection of approximately **15k samples** designed to improve **deep research capabilities** of large language models. The dataset includes two types of tasks: - Verifiable tasks (labeled as "Closed-ended Multi-hop Resolution") - Open-ended tasks (labeled as "Open-ended Exploration") ## Dataset Composition The dataset consists of **6 task categories**: - **Deep Research Report Writing**: Generates well-structured, evidence-based research reports with clear argumentation, source integration, and traceability. - **Deep Research Instruction Following**: Interprets and executes complex, multi-constraint research instructions across the full pipeline from task definition to result presentation. - **Complex Reasoning**: Covers general-purpose reasoning tasks requiring multi-step inference, logical consistency, and robust problem-solving across diverse domains. - **Skill Use**: Enables dynamic composition and execution of modular skills (e.g., retrieval, analysis, modeling, and visualization) to support end-to-end workflows. - **Multimodal Reasoning**: Handles reasoning over multiple data modalities (e.g., text, images, tables) with integrated understanding and inference. - **Document Understanding & Generation**: Supports parsing and generating structured outputs from diverse document formats, forming a closed loop of understanding, transformation, and generation. ### Data Schema Each sample is structured as follows: - meta: - id: Unique identifier (hash generated from the question) - question: Input query - answer: Reference answer (required for verifiable tasks, optional for generation tasks) - language: Language of the sample (en or zh) - type: Task type label, one of: - Closed-ended Multi-hop Resolution (verifiable tasks) - Open-ended Exploration (open-ended tasks) - messages: - role: One of system / assistant / tool / user - content: Message content ## Related Resources For more details, please refer to the model repository: [S1-DeepResearch](https://github.com/ScienceOne-AI/S1-DeepResearch)
提供机构:
ScienceOne-AI
5,000+
优质数据集
54 个
任务类型
进入经典数据集
二维码
社区交流群

面向社区/商业的数据集话题

二维码
科研交流群

面向高校/科研机构的开源数据集话题

数据驱动未来

携手共赢发展

商业合作