
InternScience/SGI-DeepResearch

Hugging Face · Updated 2025-12-30 · Indexed 2026-02-07
Download link:
https://hf-mirror.com/datasets/InternScience/SGI-DeepResearch
Overview:
SGI-Bench is a benchmark dataset for evaluating the Scientific General Intelligence (SGI) of large language models (LLMs) through scientist-aligned workflows. It spans 10 disciplines and includes over 1,000 expert-curated samples inspired by Science's 125 Big Questions. Tasks cover four stages of scientific inquiry: Deliberation, Conception, Action, and Perception. Each record has the features idx, question, steps, answer, discipline, direction, and type. The dataset includes a test split with 318 examples and is licensed under MIT. Tags include chemistry, biology, climate, science, and benchmark, and the intended tasks include question answering and text generation. The README also describes the dataset's construction, the evaluation framework, and quick-start instructions.
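The record layout described above can be checked with a short script. This is only a sketch: the field names come from the description, but the field types are not stated there, so the helpers check only that the keys are present. The helper names (`missing_fields`, `count_by_discipline`, `load_test_split`) and the sample record are hypothetical, and loading assumes the Hugging Face `datasets` library is installed and the repository is reachable.

```python
from collections import Counter

# Field names as listed in the dataset description; types are not assumed.
EXPECTED_FIELDS = ("idx", "question", "steps", "answer",
                   "discipline", "direction", "type")


def missing_fields(record):
    """Return the expected field names that are absent from one record."""
    return [name for name in EXPECTED_FIELDS if name not in record]


def count_by_discipline(records):
    """Count records per value of the 'discipline' field."""
    return Counter(record["discipline"] for record in records)


def load_test_split(repo_id="InternScience/SGI-DeepResearch"):
    """Load the test split with the Hugging Face `datasets` library.

    Requires `pip install datasets` and network access, so the import
    is kept local and nothing is downloaded unless this is called.
    """
    from datasets import load_dataset
    return load_dataset(repo_id, split="test")


if __name__ == "__main__":
    # Hypothetical record for illustration only; not taken from the dataset.
    sample = {
        "idx": 0,
        "question": "placeholder question",
        "steps": "placeholder steps",
        "answer": "placeholder answer",
        "discipline": "chemistry",
        "direction": "placeholder direction",
        "type": "placeholder type",
    }
    print(missing_fields(sample))
    print(count_by_discipline([sample]))
```

Once the `datasets` library is available, `count_by_discipline(load_test_split())` would summarize how the test examples are distributed across disciplines.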
Provided by:
InternScience