InternScience/SFE
Source: Hugging Face, updated 2025-12-24, indexed 2026-02-07
Download link:
https://hf-mirror.com/datasets/InternScience/SFE
Resource description:
The Scientists First Exam (SFE) benchmark is designed to comprehensively evaluate the scientific cognitive capabilities of Multimodal Large Language Models (MLLMs) at three cognitive levels: Scientific Signal Perception, Scientific Attribute Understanding, and Scientific Comparative Reasoning. SFE comprises 830 expert-verified Visual Question Answering (VQA) pairs across 66 multimodal tasks spanning five high-value disciplines: Astronomy, Chemistry, Earth, Life, and Materials Sciences. All tasks are bilingual (English and Chinese) to support broad accessibility. The tasks require a deep understanding of domain-specific knowledge and data analysis skills, and are intended to improve research efficiency and support advances that benefit society.
Provided by:
InternScience



