OceanBench: Oceanography Benchmark Evaluation Dataset

超神经 (hyper.ai) · Updated 2024-08-02 · Indexed 2024-12-14

Download link:

https://hyper.ai/cn/datasets/33131

Resource overview:

OceanBench is a benchmark evaluation dataset for oceanography tasks, designed in 2024 by the team of Zhang Ningyu and Chen Huajun at Zhejiang University. It covers 15 ocean-related tasks, such as question answering and description tasks, and aims to comprehensively evaluate the capabilities of large language models (LLMs) in oceanography. Samples in OceanBench are generated automatically from a seed dataset and then manually verified by domain experts to ensure that the data is specialized and accurate.

Created:

2024-08-01

Collected summary

Background overview

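OceanBench is an oceanography benchmark dataset designed by a team at Zhejiang University. It contains 15 ocean-related tasks and is used to evaluate large language models across oceanographic subfields such as physical, chemical, and biological oceanography. The dataset is built through automated generation followed by expert validation, and it is used alongside the OceanInstruct instruction dataset in training the specialized ocean model OceanGPT.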

The content above was collected and summarized by 遇见数据集.