OpenCompass Evaluation Framework
Source: arXiv. Indexed 2025-09-30.
Download link:
https://opencompass.readthedocs.io/en/latest/get_started/faq.html
Resource overview:
This resource provides a framework for evaluating multiple models across five dimensions: reasoning, language, knowledge, examination, and understanding. Evaluation is zero-shot or few-shot, so no additional training is required. It spans multiple benchmark datasets, and its tasks center on model evaluation.
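The difference between zero-shot and few-shot evaluation can be illustrated with a minimal sketch. This is not OpenCompass's API: the `build_prompt` helper, the question/answer template, and the sample items are all hypothetical and are assumed here purely for illustration.

```python
# Hypothetical helper, not part of OpenCompass: it only illustrates how a
# zero-shot prompt differs from a few-shot prompt.
def build_prompt(question, examples=()):
    """Return an evaluation prompt.

    With no examples the prompt is zero-shot; with one or more solved
    (question, answer) pairs prepended it is few-shot.
    """
    parts = []
    for example_question, example_answer in examples:
        parts.append(f"Question: {example_question}\nAnswer: {example_answer}")
    # The target question is always last and leaves the answer blank.
    parts.append(f"Question: {question}\nAnswer:")
    return "\n\n".join(parts)


# Zero-shot: the model sees only the target question.
zero_shot = build_prompt("What is 2 + 3?")

# Few-shot: solved examples (made up for this sketch) come first.
few_shot = build_prompt(
    "What is 2 + 3?",
    examples=[("What is 1 + 1?", "2"), ("What is 4 + 2?", "6")],
)
```

In both cases the model is only prompted, never fine-tuned, which is why no additional training is needed.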
Provider:
OpenCompass



