SAT English Exam
Source: arXiv. Indexed 2025-09-30.
Download link:
https://github.com/matthewrenze/self-reflection
Description:
This dataset is used to evaluate self-reflective large language model (LLM) agents on the English section of the SAT exam. Self-reflection has a smaller effect on performance on this dataset than on the LSAT-AR dataset. The task is multiple-choice question answering (MCQA).



