Novel Benchmark for In-Context Learning
Source: arXiv. Indexed: 2025-09-30
Download link:
https://github.com/mikelixiang88/context-matters.git
Description:
This benchmark consists of difficult scientific questions, each paired with contexts of varying relevance to the question, and is used to evaluate the in-context learning performance of large language models (LLMs). Its goal is to reveal how LLMs differ in handling closed-ended and open-ended questions, so the task evaluates both question types under in-context learning.
Provider:
Mikelixiang88



