
tubitak-science-olympiad-tr

Source: Hugging Face. Updated: 2026-03-21. Indexed: 2026-03-23.
Download link:
https://huggingface.co/datasets/ytu-ce-cosmos/tubitak-science-olympiad-tr
Description:
The TUBITAK Science Olympiad Dataset contains multiple-choice and open-ended science questions drawn from the olympiads organized by the Scientific and Technological Research Council of Türkiye (TÜBİTAK). It is designed as a benchmark for evaluating the advanced analytical, mathematical, and computational reasoning of Large Language Models (LLMs) in Turkish. The dataset contains approximately 2,700 questions across five domains: Computer Science, Physics, Mathematics, Secondary School Computer Science, and Secondary School Mathematics. Each question entry includes a unique identifier, subject domain, year, exam stage, question number, question image, solution image (if available), LaTeX-formatted question and solution texts, and several boolean flags and option values. The dataset is particularly well suited to testing LLMs on multi-step scientific reasoning, multimodal evaluation, and Chain-of-Thought (CoT) capabilities in non-English settings. It is released under the CC BY 4.0 license and is intended for research and educational use.
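
As a rough illustration of how the entry structure described above might be used in an evaluation, the sketch below models one question record and scores multiple-choice predictions per domain. Every field name here (`question_id`, `domain`, `is_multiple_choice`, `correct_option`, and so on) is a hypothetical stand-in chosen for this example, as is the `accuracy_by_domain` helper; the actual column names should be taken from the dataset card.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class OlympiadQuestion:
    # Hypothetical field names; the real schema may differ.
    question_id: str
    domain: str                    # e.g. "Physics", "Mathematics"
    year: int
    stage: str                     # exam stage
    question_number: int
    question_text: str             # LaTeX-formatted question
    solution_text: Optional[str]   # LaTeX-formatted solution, if available
    is_multiple_choice: bool
    correct_option: Optional[str]  # None for open-ended questions


def accuracy_by_domain(questions, predictions):
    """Return {domain: fraction correct} over multiple-choice questions only.

    `predictions` maps question_id to a predicted option label.
    A question with no prediction counts as incorrect.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for q in questions:
        # Open-ended questions need a different grading scheme, so skip them.
        if not q.is_multiple_choice or q.correct_option is None:
            continue
        total[q.domain] += 1
        if predictions.get(q.question_id) == q.correct_option:
            correct[q.domain] += 1
    return {domain: correct[domain] / total[domain] for domain in total}
```

Keeping open-ended questions out of the accuracy figure is a design choice made for this sketch; grading free-form solutions would need a separate comparison step that the description above does not specify.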
Provided by:
Yildiz Technical University Computer Engineering Department Cosmos Research Group
Created:
2026-03-21