SoAy Bench
Source: arXiv (indexed 2025-09-30)
Download link:
https://github.com/RUCKBReasoning/SoAy
Overview:
SoAy Bench is a benchmark for evaluating how well large language models (LLMs) use API-based solutions for academic information retrieval. It allows the SoAy method to be compared with existing approaches in academic information search scenarios. The benchmark defines detailed evaluation metrics (precision matching, document similarity, word similarity, word pair matching, and entity extraction) that measure the accuracy of both the generated solutions and the answers they produce.
Provided by:
RUCKBReasoning



