InternScience/SGI-DeepResearch
Source: Hugging Face. Updated 2025-12-30. Indexed 2026-02-07.
Download link:
https://hf-mirror.com/datasets/InternScience/SGI-DeepResearch
Description:
SGI-Bench is a dataset for evaluating the Scientific General Intelligence (SGI) of large language models (LLMs) through scientist-aligned workflows. It spans 10 disciplines and includes over 1,000 expert-curated samples inspired by Science magazine's 125 Big Questions. Its tasks cover four stages of scientific inquiry: Deliberation, Conception, Action, and Perception. Each record has the fields idx, question, steps, answer, discipline, direction, and type. The dataset includes one test split with 318 examples and is licensed under MIT. Its tags include chemistry, biology, climate, science, and benchmark, and it is intended for tasks such as question answering and text generation. The README also describes the dataset's construction, the evaluation framework, and quick-start instructions.
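The record layout described above can be checked with a few lines of Python. The following is a minimal sketch, not the README's quick-start code: the helper names (`missing_fields`, `count_by`, `load_test_split`) are hypothetical, the field value types are not stated in the description and are left untyped, and the call to `load_dataset` assumes the repository loads with its default configuration.

```python
from collections import Counter
from typing import Any, Iterable, Mapping

# Field names as listed in the description; their value types are not stated there.
EXPECTED_FIELDS = ("idx", "question", "steps", "answer",
                   "discipline", "direction", "type")


def missing_fields(record: Mapping[str, Any]) -> list[str]:
    """Return the expected field names that are absent from one record."""
    return [name for name in EXPECTED_FIELDS if name not in record]


def count_by(records: Iterable[Mapping[str, Any]], field: str) -> Counter:
    """Count records per value of one field, for example 'discipline'."""
    return Counter(record[field] for record in records)


def load_test_split():
    """Load the test split from the Hugging Face Hub.

    Needs network access and `pip install datasets`. Assumes the default
    configuration of the repository is the one wanted.
    """
    from datasets import load_dataset  # lazy import: helpers above work without it
    return load_dataset("InternScience/SGI-DeepResearch", split="test")
```

Once the split is loaded, `count_by(load_test_split(), "discipline")` would give the number of examples per discipline, and `missing_fields(row)` on any row would flag a schema that differs from the one described here.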
Provider:
InternScience



